Vault
Overview
The Vault 1.16.x upgrade guide contains information on deprecations, important or breaking changes, and remediation recommendations for anyone upgrading from Vault 1.15. Please read carefully.
Important changes
External plugin variables take precedence over system variables
Vault gives precedence to plugin environment variables over system environment variables when loading external plugins. The behavior for builtin plugins and plugins that do not specify additional environment variables is unaffected.
For example, if you register an external plugin with SOURCE=child
in the
env parameter but the main Vault
process already has SOURCE=parent
defined, the plugin process starts
with SOURCE=child
.
Refer to the plugin management page for more details on plugin environment variables.
Avoid conflicts with containerized plugins
Containerized plugins do not inherit system-defined environment variables. As a result, containerized plugins cannot have conflicts with Vault environment variables.
How to opt out
To opt out of the precedence change, set the
VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING
environment variable to true
for the
main Vault process:
$ export VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING=true
Setting VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING
to true
tells Vault to:
- prioritize environment variables from the Vault server environment whenever the system detects a variable conflict.
- report on plugin variable conflicts during the unseal process by printing warnings for plugins with conflicting environment variables or logging an informational entry when there are no conflicts.
For example, assume you set VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING
to true
and have an environment variable SOURCE=parent
.
If you register an external plugin called myplugin
with SOURCE=child
, the
plugin process starts with SOURCE=parent
and Vault reports a conflict for
myplugin
.
LDAP auth login changes
Users cannot log in using LDAP unless the LDAP plugin is configured
with an userdn
value scoped to an organization unit (OU) where the
user resides.
LDAP auth entity alias names no longer include upndomain
The userattr
field on the LDAP auth config is now used as the entity alias.
Prior to 1.16, the LDAP auth method would detect if upndomain
was configured
on the mount and then use <cn>@<upndomain>
as the entity alias value.
The consequence of not configuring this correctly means users may not have the correct policies attached to their tokens when logging in.
How to opt out
To opt out of the entity alias change, update the userattr
field on the config:
userattr="userprincipalname"
Refer to the LDAP auth method (API) page for more details on the configuration.
Secrets Sync now requires setting a one-time flag before use
To use the Secrets Sync feature, the feature must be activated with a new one-time operation called an activation-flag. The feature is gated until a Vault operator decides to trigger the flag. More information can be found in the secrets sync documentation.
Auto-rolled billing start date
As of 1.16.7 and later, the billing start date (license start date if not configured) automatically rolls over to the latest billing year at the end of the last cycle.
This means that the billing start dates from earlier years will get updated to current calendar year, unless the date has not yet occured this year, in which case it will remain in the last calendar year. The billing start date will adjust accordingly for leap years.
The default billing start date used in queries and license utilization reporting will be automatically adjusted to the latest billing year. Past year billing information will still be retained.
For example, assume a customer has signed a deal for 3 years with a license start date of Dec 1, 2023. After upgrading to the version with this change, the billing start date auto-rolls to the most recent year.
- If the upgrade happens on Sept 1, 2024, the billing start date will continue to be Dec 1, 2023. Once we reach Dec 1, 2024, the billing start date will automatically roll to Dec 1, 2024.
- If the upgrade happens on Dec 15, 2024, the billing start date should automatically roll to Dec 1, 2024 after upgrade.
Docker image no longer contains curl
As of 1.16.7 and later, the curl
binary is no longer included in the published Docker container
images for Vault and Vault Enterprise. If your workflow depends on curl
being available in the
container, consider one of the following strategies:
Create a wrapper container image
Use the HashiCorp image as a base image to create a new container image with curl
installed.
FROM hashicorp/vault-enterprise
RUN apk add curl
NOTE: While this is the preferred option it will require managing your own registry and rebuilding new images.
Install it at runtime dynamically
When running the image as root (not recommended), you can install it at runtime dynamically by using the apk
package manager:
docker exec <CONTAINER-ID> apk add curl
kubectl exec -ti <NAME> -- apk add curl
When running the image as non-root without privilege escalation (recommended) you can use existing
tools to install a static binary of curl
into the vault
users home directory:
docker exec <CONTAINER-ID> wget https://github.com/moparisthebest/static-curl/releases/latest/download/curl-amd64 -O /home/vault/curl && chmod +x /home/vault/curl
kubectl exec -ti <NAME> -- wget https://github.com/moparisthebest/static-curl/releases/latest/download/curl-amd64 -O /home/vault/curl && chmod +x /home/vault/curl
NOTE: When using this option you'll want to verify that the static binary comes from a trusted source.
Known issues and workarounds
Client tokens and token accessors audited in plaintext
Affected versions
- 1.16.7, 1.16.8, 1.17.3, 1.17.4
Issue
In versions 1.16.7, 1.16.8, 1.17.3, and 1.17.4 audit logs may contain non-hmac’d values for client_token and accessor data in the response portion. A fix has been created and is released in 1.16.9 and 1.17.5.
Workaround
It is recommended to avoid affected versions when upgrading. If you are on these versions and using the audit logging feature please upgrade promptly to 1.16.9 or 1.17.5.
JWT auth login requires bound audiences on the role
Affected versions
- 1.15.9
- 1.15.10
- 1.15.11
- 1.16.3
- 1.16.4
- 1.16.5
Issue
A behavior change was made in the jwt auth plugin to address CVE-2024-5798. Since the behavior change was a breaking change, we reverted the change in the versions 1.15.12 and 1.16.6 and later. However, the behavior change will go into effect in 1.17.
The new behavior requires that the bound_audiences
parameter of "jwt" roles
must be set and must match at least one of the JWT's associated aud
claims if there are any.
The aud
claim can be a single string or a list of strings as per
RFC 7519 Section 4.1.3.
Users may not be able to log into Vault if the JWT role is configured incorrectly. For additional details, refer to the JWT auth method (API) documentation.
See this issue for more details.
Workaround
Configure the bound_audiences
parameter of "jwt" roles to match at least one
of the JWT's associated aud
claims. This configuratoin will be required for
1.17 and later.
Error configuring the JWT auth method
Affected versions
- 1.16.1
Issue
An error will occur when configuring the built-in jwt auth method. This will affect new mounts and updates to existing mounts. Existing mounts should not encounter an error if no modifications are made.
See this issue for more details.
This issue is addressed in Vault 1.16.2 and later.
Workaround
Do not attempt to update an existing mount's config. New mounts can run the plugin as an external plugin to avoid the error.
Error logging in with LDAP auth method when anonymous group search is enabled
Affected versions
- 1.16.0
Issue
Depending on their LDAP configuration, some customers may encounter an error when attempting to login with ldap auth method when anonymous group search is enabled. See this issue for more details.
This issue was resolved in 1.16.1
.
Workaround
There is no workaround.
Error logging in with LDAP auth method
Affected versions
- 1.16.0
Issue
Depending on their LDAP configuration, some customers may encounter an error when attempting to login with ldap auth method. Active Directory users are affected by this bug. See this issue for more details.
This issue was resolved in 1.16.1
.
Workaround
There is no workaround.
Existing clusters do not show the current Vault version in UI by default
Affected versions
- 1.16+
Issue
Previous versions of the Vault UI fetched version information from
un-authenticated endpoints like sys/health
and sys/seal-status
. Since
introducing the redact_version
TCP listener parameter, version information may
no longer be available under some configurations. As a result, the Vault UI now
uses a new, authenticated endpoint (sys/internal/ui/version
) to fetch
version information.
By default, all new Vault servers created with v1.16+ include the following rule, in the automatically-generated policy:
# Allow a token to look up the Vault version. This path is not subject to
# redaction like the unauthenticated endpoints that provide the Vault version.
path "sys/internal/ui/version" {
capabilities = ["read"]
}
However, the default policy for existing Vault servers does not update automatically during the upgrade. You must update the policy manually in order for the Vault version to be displayed in the Vault UI.
No other functionality in the Vault UI is affected by this issue.
You can use the Vault CLI to update the default policy and allow the Vault UI to query the Vault server for version information:
$ vault policy read default | cat - <<< '
# Allow a token to look up the Vault version. This path is not subject to
# redaction like the unauthenticated endpoints that provide the Vault version.
path "sys/internal/ui/version" {
capabilities = ["read"]
}
' > default-policy.hcl
$ vault policy write default ./default-policy.hcl
Default lease count quota enabled when upgrading from Vault versions before 1.9
Affected versions
- 1.16+
Issue
Vault began tracking version history as of version 1.9. As of version 1.16, new Vault installs automatically enable a lease count quota by consulting the version history. If the version history is empty on upgrade, Vault treats the upgrade as a new install and automatically enables a default lease count quota.
Before you upgrade Vault from a version prior to 1.9 to versions 1.16+,
you should check the current number of unexpired leases via the
vault.expire.num_leases
metric.
If the number of unexpired leases is below the default lease count quota, value of 300000 no extra steps are required.
If the number of unexpired leases is greater than the default threshold of 300000, there is a two step workaround to safely upgrade without the default lease count quota:
- Upgrade to any Vault version prior to 1.16 (between 1.9 and 1.15) to populate the version store.
- Upgrade to Vault version 1.16+.
You can review, modify, and delete the global default quota at any point with
the
/sys/quotas/lease-count/default
endpoint:
$ vault read sys/quotas/lease-count/default
$ vault delete sys/quotas/lease-count/default
$ vault write sys/quotas/lease-count/default max_leases=<# of max leases>
Refer to Protecting Vault with Resource Quotas for a step-by-step tutorial on quota tuning.
Refer to Lease Explosions for more information on lease management.
PKI OCSP GET requests can return HTTP redirect responses
If a base64 encoded OCSP request contains consecutive '/' characters, the GET request will return a 301 permanent redirect response. If the redirection is followed, the request will not decode as it will not be a properly base64 encoded request.
As a workaround, OCSP POST requests can be used which are unaffected.
Impacted versions
Affects all current versions of 1.12.x, 1.13.x, 1.14.x, 1.15.x, 1.16.x, 1.17.x.
Azure secrets engine role creation failing
Affected versions
- 1.16.0, 1.16.1, 1.16.2
Issue
Creating Azure Secrets engine roles by specifying the Azure App Registration
Object ID as the application_object_id causes Vault to error out stating error
loading Application: no application found. This was introduced by
vault-plugin-secrets-azure
v0.17.0, which incorrectly queries for an application
by client_id and not application_object_id.
Workaround
Users can pass a client_id instead of an application_object_id into the application_object_id parameter. However, after upgrading to a version with a fix (1.16.3+), the user will need to remember to switch back to the application_object_id.
Performance Standbys revert to Standby mode on unseal
Affected versions
- 1.14.12
- 1.15.8
- 1.16.2
Issue
If you previously set a value for retention_months
via the
sys/internal/counters/config
endpoint, upgrading to Vault Enterprise versions 1.14.12, 1.15.8, and 1.16.2
will cause Performance Standby
nodes to revert to Standby mode.
Adding nodes with Vault Enterprise versions 1.14.12, 1.15.8, or 1.16.2 to a
cluster with an older versioned leader will see any previously set
retention_months
value and attempt to write the new minimum value of 48
. The
storage write will result in a read-only error:
[ERROR] core: performance standby post-unseal setup failed: error="cannot write to readonly storage"
You can verify the status of your nodes by checking the /sys/health endpoint.
Deployments that rely on scaling across Performance Standbys will now forward all requests to the active node, increasing the utilization of the active node.
Post-upgrade cluster membership
During the last step of a full upgrade, the old leader steps down, causing one of the Standby nodes to become leader.A fix for the read-only storage error has been prioritized and escalated. The fix will be in releases 1.14.13, 1.15.9 and 1.16.3.
Important
If you have already upgraded to versions 1.14.12, 1.15.8, or 1.16.2, please refer to the workaround section for options.Workaround
Once the leader of the cluster has been updgraded to version 1.14.12, 1.15.8, or
1.16.2, the workaround is to update the retention_months
value on the active
node via the
sys/internal/counters/config
endpoint:
$ vault write sys/internal/counters/config retention_months=48
This storage entry will be written to all nodes in the cluster, allowing them to immediately unseal as Performance Standbys.
After the new retention_months
value is written to storage on the active node,
adding new nodes to the cluster will not cause the read-only error.
Sending SIGHUP to vault standby node causes panic
Affected versions
- 1.13.4+
- 1.14.0+
- 1.15.0+
- 1.16.0+
Issue
Sending a SIGHUP to a vault standby node running an enterprise build can cause a panic if there is a change to the license, or reporting configuration. Active and performance standby nodes will perform fine. It is recommended that operators stop and restart vault nodes individually if configuration changes are required.
Workaround
Instead of issuing a SIGHUP, users should stop individual vault nodes, update the configuration or license and then restart the node.
New nodes added by autopilot upgrades provisioned with the wrong version
Affected versions
- 1.15.3 - 1.15.9
- 1.16.1 - 1.16.3
Issue
If autopilot_upgrade_version
is not explicitly set in the Vault configuration file in the storage
section, new non-active nodes will retain their original Vault version as opposed to the new version.
Workaround
Set the desired version in the configuration file as autopilot_upgrade_version=<version string>
. This will
allow all nodes to receive the proper version to upgrade to.
Secrets Sync cannot be activated from chroot namespace
Affected versions
- 1.16.0+
Issue
Secrets Sync cannot be activated from the chroot namespace. The Secrets Sync feature
now requires a new activation-flag to be enabled before it can be used. Writing to
any sys/activation-flags/
path currently requires root namespace access.
Workaround
Users can request a Vault operator to activate the feature from the root namespace if they lack the necessary access.
Potential DoS when using the deny_unauthorized
proxy protocol behavior for a TCP listener
Affected versions
Community Edition (CE)
- 1.10.x - 1.15.x
- 1.17.1
Enterprise
- 1.10.x+ent - 1.15.11+ent
- 1.16.5+ent
- 1.17.1+ent
Issue
Vault TCP listeners configured with the deny_unauthorized
proxy_protocol_behavior
close if they receive a request from an untrusted upstream connection. As a result,
Vault no longer responds to any request received through that listener.
Being able to force-close listeners with intentionally untrusted connections leaves the Vault API vulnerable to a denial-of-service (DoS) attacks, which could cause the API to become unavailable on that node.
The vulnerability is addressed in Vault 1.15.12+ent, 1.16.6+ent, 1.17.2, 1.17.2+ent and later.
Workaround
Do not configure a proxy_protocol_behavior
with the deny_unauthorized
value.
Deleting an entity-aliases does not remove it from the in-memory database on standby nodes
Affected versions
Vault Community Edition
- 1.16.3
- 1.17.0 - 1.17.2
Enterprise
- 1.16.3+ent - 1.16.6+ent
- 1.17.0+ent - 1.17.2+ent
Issue
Entity-aliases deleted from Vault are not removed from the in-memory database on standby nodes within a cluster. As a result, API requests to create a new entity-alias with the same mount accessor and name sent to a standby node will fail.
This bug is fixed in Vault 1.16.7+ent, 1.17.3, 1.17.3+ent and later.
Duplicate identity groups created when concurrent requests sent to the primary and PR secondary cluster
Affected versions
- All Enterprise versions from 0.7.0 up
Issue
Duplicate identity groups are groups that have the same name, and should not be possible in vault. However, there is a race condition in which duplicate groups are possible. If more than 1 create group request is sent at the same time to a Primary Cluster and the PR secondary cluster, it can result in group duplicates.
Workaround
All create group requests should be sent to the primary cluster.
Manual entity merges sent to a PR secondary cluster are not persisted to storage
Affected versions
- Vault Enterprise v0.7.0+
Issue
If you send a manual entity merge to a PR secondary cluster, the change is temporarily reflected on the active node of the PR secondary cluster, but Vault does not save the change to persistent storage.
Workaround
Always send manual entity merge requests to the primary cluster.