Why can't we access dynamic data in unstable_cache?
Unanswered
Paper wasp posted this in #help-forum
Paper waspOP
We ran into this problem in our app that boils down to this.
1. We store Authorization data in the `headers()` of a request.
2. We want to check Authorization immediately before querying data in our `data-layer`.
And I can see how in most cases this can be done simply by calling the headers outside of the unstable_cache call.
But in our use case we also do some heavy computations on the data before returning it to the client. What we really want is to be able to add the computational result to the cache as well so we don't have to re-compute the result each time. Additionally this result can be shared between users: even though we need to check they are allowed to create it in the first place, once we have checked that, many other people can view the cached result.
The unstable_cache is perfect for this because we can craft the cache-key ourselves to limit access to the users who need it, but it is done at a very high level. If we were to call headers() or cookies() at the time we want to cache we would end up needing to propagate that data very far down the function execution in function params.
It seems arbitrary to me why this restriction is in place.
I understand that if you cache `headers()` (or rather cache the result of the function that uses headers) you could leak request context into other requests... But shouldn't this be left up to "us" as framework users to decide? Unless there is some other reason this is bad that I am unaware of...
22 Replies
@Paper wasp We ran into this problem in our app that boils down to this.
headers() is dependent on the request, so it is impossible to cache: different requests have different headers, and if you cached the value it would become inaccurate for at least one of the requests.
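The leak being described can be sketched in plain JavaScript (hypothetical stand-ins, not Next.js internals): wrap a cache around a function that reads per-request state, and the first request's value gets served to everyone.

```javascript
// Hypothetical stand-in for per-request state such as headers().
// NOT Next.js code - just illustrates the leak a cached headers() read causes.
let currentRequest = null;

// A naive once-only cache, like blindly caching a function body.
function cacheOnce(fn) {
  let cached;
  let hasValue = false;
  return () => {
    if (!hasValue) {
      cached = fn();
      hasValue = true;
    }
    return cached;
  };
}

// BAD: the cached body reads request state directly.
const leakyUser = cacheOnce(() => currentRequest.user);

currentRequest = { user: "alice" };
const first = leakyUser(); // computes and caches "alice"

currentRequest = { user: "bob" };
const second = leakyUser(); // still "alice" - bob now sees alice's value
```

That is the inaccuracy meant above: the cached value is wrong for every later request whose headers differ from the first.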
this works though:
const getData = unstable_cache(
  async (foo) => ...,
  ...
);
const foo = headers().get("foo");
await getData(foo);
Paper waspOP
That is nice in theory but in practice we have been following the guidelines (which I like) in using the `data-layer` to do authorization. It seems to me like this is also just as valid.
function complexCalculation() {
  const calc1 = nestedCalc1()
  const calc2 = nestedCalc2()
  return nestedCalc3(calc1, calc2)
}

const cacheKey = getCacheKey(headers())
const getData = unstable_cache(
  async () => complexCalculation(),
  [cacheKey]
);
await getData();
Because if I now have to add the headers to the unstable_cache call then this becomes
function complexCalculation(foo) {
  const calc1 = nestedCalc1(foo)
  const calc2 = nestedCalc2(foo)
  return nestedCalc3(foo, calc1, calc2)
}

const getData = unstable_cache(
  async (foo) => complexCalculation(foo),
  [cacheKey]
);
const foo = headers().get("foo")
await getData(foo);
In particular this boils down to in the data-layer something like this
function getAuthorizedData() {
  const auth = headers().get("auth")
  return db.select(data).where(eq(data.id, auth.id))
}
Turns into something like this...
function getAuthorizedData(auth) {
  return db.select(data).where(eq(data.id, auth.id))
}
Which removes one of the biggest draw cards of a data access layer in the first place... co-locating your auth and data access
Paper waspOP
Also in our case many different `headers()` results would have the same result, so the cached value would be correct for a variety of different headers() results. If a request were to have headers that meant a different result was needed then we would produce a different cache_key for those headers.
It seems to me like this is also just as valid.
function complexCalculation() {
  const calc1 = nestedCalc1()
  const calc2 = nestedCalc2()
  return nestedCalc3(calc1, calc2)
}

const cacheKey = getCacheKey(headers())
const getData = unstable_cache(
  async () => complexCalculation(),
  [cacheKey]
);
await getData();
yes because it is! as long as you don't run `headers()` inside the unstable_cache body, it's fine
the value of `headers()` can be used to compute the cache key to check for the cache, like you did, normally
Paper waspOP
Ah I should add that at each step of these nested ‘nestedCalc’ calls the headers() are checked for data-layer db access
@Paper wasp Ah I should add that at each step of these nested ‘nestedCalc’ calls the headers() are checked for data-layer db access
then no, that's not possible for the reason I already explained above.
headers() is dependent on the request. so it is impossible to be cached
headers().get("foo") is not request-specific (several requests can share the same value), so it can be cached
headers() itself is request-specific (different requests have different values) so cannot be cached
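That distinction can be sketched with a toy argument-keyed cache (hypothetical, not the real unstable_cache): when the header's value is passed in as an argument, requests that share the value share the cached result, and requests with a different value get their own entry.

```javascript
// Toy argument-keyed cache - a sketch of the idea, not unstable_cache itself.
function cacheByArgs(fn) {
  const store = new Map();
  return (...args) => {
    const key = JSON.stringify(args); // the arguments become the cache key
    if (!store.has(key)) {
      store.set(key, fn(...args));
    }
    return store.get(key);
  };
}

let computations = 0;
const getData = cacheByArgs((foo) => {
  computations++; // count how often the "expensive" work actually runs
  return `data-for-${foo}`;
});

getData("a"); // computed
getData("a"); // cache hit - another request with the same header value
getData("b"); // different value, different entry - computed again
```

The request itself never enters the cache; only the extracted value does, which is why it can be shared safely.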
Paper waspOP
Ok so take this example then.
function nestedCalc1() {
  const foo = headers().get("foo")
  return db.select(data).where(eq(data.id, foo))
}

function nestedCalc2() {
  const bar = headers().get("bar")
  return db.select(data).where(eq(data.id, bar))
}

function nestedCalc3(calc1Result, calc2Result) {
  const baz = headers().get("baz")
  // Some very expensive calculation on the results of 1 and 2
}

function complexCalculation() {
  const calc1 = nestedCalc1()
  const calc2 = nestedCalc2()
  return nestedCalc3(calc1, calc2)
}

function getCacheKey() {
  return `${headers().get("foo")}-${headers().get("bar")}-${headers().get("baz")}`
}

const getData = unstable_cache(
  async () => complexCalculation(),
  [getCacheKey()]
);
await getData();
This code will fail since I am calling headers() in an unstable_cache call, but I am doing it in such a way that there is no way that different requests could accidentally access the same data. They CAN share data if their foo, bar and baz values are the same, but that is what I want, since those values are all the calcs depend on. It seems to me like this could be supported?
Next.js doesn't pre-parse your file. unstable_cache is a normal function and simply runs like any other JavaScript function; it cannot edit itself to accept headers() in implementation-specific cases like this. So no, it is not possible…
… unless you figure out a way to make it possible, in which case a PR is always welcome
So for example, if unstable_cache parsed its content first (somehow), then extracted all header retrievals outside the cached scope and used them as cache keys automatically, then yes, it would be possible. So theoretically speaking it is possible. But is it implemented? No, so the only way you can have it is by implementing it yourself and filing a PR.
Paper waspOP
ok I get what you're saying and totally understand that is a huge change... But from reading the nextjs implementation of this I see this code.
https://github.com/vercel/next.js/blob/1b93f366fc175f94322f6811b6f4459d42935d79/packages/next/src/server/app-render/dynamic-rendering.ts#L118-L161
And am still just wondering why?
Like why does nextjs have this safeguard in? If someone is using the unstable_cache there are a million other ways that someone can leak the cache context but next seems to be limiting this one specifically.
snippet is too long for discord
like on one hand you "could" somehow hoist all the `headers()` and `cookies()` calls in the unstable_cache call... But it seems to be a workaround to a problem that I don't even know exists in the first place.
essentially the docs describe what happens but I don't understand why that happens 😅
@Paper wasp ok I get what you're saying and totally understand that is a huge change... But from reading the nextjs implementation of this I see this code.
That function is simply a check for whether dynamic functions are run inside a static generation procedure, for lack of a better word.
Static pages are in a static generation procedure, so an error is thrown to tell users what to do instead.
unstable_cache's body is also a static generation procedure, so the same thing happens. It's like you are telling it "this should be both static and dynamic". That's impossible, which is why they throw a user-friendly error here rather than let the logic throw some obscure unreadable errors later on.
There’s just that, nothing special here.
Calling headers() itself is a forced dynamic procedure. Inside unstable_cache is a forced static procedure. They simply don’t go together.
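A rough sketch of that guard (hypothetical stand-ins, not the actual dynamic-rendering.ts logic): while the "static" callback runs, a flag is set, and the "dynamic" API refuses to execute.

```javascript
// Sketch of the "dynamic API inside a static scope" guard. Hypothetical
// stand-ins for headers() and unstable_cache - not Next.js source.
let insideStaticScope = false;

function headersStandIn() {
  if (insideStaticScope) {
    throw new Error("Dynamic API called inside a static scope");
  }
  return { get: (key) => `value-of-${key}` };
}

function staticScope(fn) {
  return (...args) => {
    insideStaticScope = true; // the callback must not depend on the request
    try {
      return fn(...args);
    } finally {
      insideStaticScope = false;
    }
  };
}

// Fine: read the header outside, pass the value in.
const foo = headersStandIn().get("foo");
const ok = staticScope((value) => value.toUpperCase())(foo);

// Throws, like Next's error: the dynamic call happens inside the static body.
let failed = false;
try {
  staticScope(() => headersStandIn().get("foo"))();
} catch {
  failed = true;
}
```

The error is raised eagerly, at the moment of the dynamic call, instead of letting a stale cached value leak later.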
Paper waspOP
ahhh sweet, thank you. I'll go read up on why unstable_cache is static 😄
static vs dynamic is like this:
* dynamic procedures are guaranteed to be called every request.
* static procedures may or may not be called every request, and whether they run at a particular request must be known before they are run.
in the case of unstable_cache, the callback may or may not run – it is static. whether it runs or it doesn't run is determined by [a lot of conditions](https://github.com/vercel/next.js/blob/a00146e001267b9b294a53eb33ea935f6ee2729c/packages/next/src/server/web/spec-extension/unstable-cache.ts#L93) that are all known in advance, before the callback itself may be run ([here](https://github.com/vercel/next.js/blob/a00146e001267b9b294a53eb33ea935f6ee2729c/packages/next/src/server/web/spec-extension/unstable-cache.ts#L221-L232) or [here](https://github.com/vercel/next.js/blob/a00146e001267b9b294a53eb33ea935f6ee2729c/packages/next/src/server/web/spec-extension/unstable-cache.ts#L259-L269)).
in your case, you are basically asking unstable_cache to cache a callback where whether it should or should not run is determined inside the callback itself. in other words, whether the callback should be run is not known until the callback is run. this is impossible logic, so it is impossible.
it could be possible but only with prior parsing of the callback. if nextjs can statically parse the callback data, determine which header is needed during the run, then either split the callback into several unstable_cache calls separated by dynamic headers() calls, or run the headers() in advance and save it into some sort of context, then it would be possible. but that relies on the ability to statically analyse the javascript ast of the callback that you pass to the function. nextjs doesn't have that, it simply runs the function.
tldr: no, it's impossible. just like humans can't grow wings, unstable_cache can't run headers().
Paper waspOP
Whether the unstable_cache should be run or not is known in advance in my example...
1. The cacheKey is derived from params and key-parts.
2. If the cache has a hit, return.
3. If the cache has a miss, run.
Nothing there in the params or key-parts is unknown or determined inside the callback. I could just simply write `unstable_cache(() => headers().get("foo"), ["bar"])` and next would throw an error... Even though the params and key-parts are known ahead of time.
The second time it runs you would hit a cache regardless of your headers() value. Which might be what you want? If you wanted to revalidate the cache each time your headers value changed you could instead write `unstable_cache(() => headers().get("foo"), ["bar", headers().get("foo")])`
I'm convinced that I can't convince you. Hope someone will be able to explain it to you