Kratos periodic CSRF problem after upgrading from v0.4.6 to v0.5.3

Hi there, I have been using Kratos for quite a while and it was all working properly.

But today, after I upgraded it from 0.4.6 to 0.5.3, a CSRF problem showed up. Kratos complains about a missing CSRF token for all flows. The issue seems to be periodic: for a short period of time it doesn’t work, and then it works again after a while. So it’s very hard for me to reproduce.

I am aware of this doc, and I have double-checked that none of those items applies to me.

I am just wondering whether there are any breaking changes relating to CSRF handling between these two versions that are not mentioned in the CHANGELOG?

Thanks a lot. :pray::pray::pray:

More technical detail from when it is breaking:

I tried:

http "${KRATOS_ADMIN}/self-service/login/flows?id=${FLOW_ID}"

and then I grabbed the returned CSRF token and passed it in:

http --form POST "${KRATOS_PUBLIC}/self-service/login/methods/password?flow=${FLOW_ID}" \
  csrf_token='THE ABOVE CSRF' \
  identifier='random' \
  password='random' \
  Host:'' # this is to simulate the browser header

It redirects me to the error flow, with the following log in Kratos:

time=2020-10-31T06:05:15Z level=info msg=Encountered self-service login error. audience=audit error=map[debug: message:The requested action was forbidden reason:A request failed due to a missing or invalid csrf_token value. status:Forbidden status_code:403] ......

Sorry for the late reply - does this problem happen in the browser, an API client (e.g. react native app) or what context exactly?

Sorry for the late reply; I really couldn’t find a way to reproduce this consistently.

To answer your questions:

does this problem happen in the browser, an API client

It happens in the browser flow.

what context exactly

The last time my customer reported this, I noticed the following events:

  • The customer was redirected to the login screen because of an expired session.
  • The customer tried to log in again, but no matter what he did, the error was always the one I showed above.
  • The customer tried clearing the cache and using a different browser, but the problem persisted.
  • At the same time, another customer attempted to log in and succeeded.
  • After the other customer had logged in, the reporting customer said he was able to log in.

There have been other situations similar to the above.

Any help will be appreciated!

Hm, without a reproducible test case it will be very hard to debug this. We need at least some headers, the request URLs, config, setup, and so on. Otherwise we would be doing fortune telling :wink:

Yeah, I totally understand, and I appreciate you spending time reading this already :slight_smile:.

In terms of more tech detail:

My initial message contains the minimal steps that trigger the error while the issue is happening. The log message is the only information Kratos emits about it and, as you can see, it is very different from the normal CSRF mismatch error log.

Maybe I can ask this question from a different perspective?

I understand that normally, when a user completes a login flow and the CSRF token mismatches, Kratos returns a validation error as opposed to redirecting the user to the new error flow (introduced in recent releases).

My question is: in what cases would a CSRF issue trigger the error flow above instead of a simple validation error?

Your code example in your original post fails because you are not including any cookies in the HTTP request.

:man_facepalming: You are right. My example wasn’t correct. I didn’t realize Kratos needs both the CSRF token in the form and the CSRF token in the cookie.
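A minimal sketch of what the corrected request might look like, assuming httpie's `--session` option is used to carry cookies between calls. The `/self-service/login/browser` initialization endpoint, the session file name, and the `jq` path into the flow JSON are assumptions about a v0.5-style setup and are not confirmed in this thread; `flow_id_from_headers` and `login_with_cookies` are hypothetical helper names.

```shell
# Sketch only: the init endpoint, the session file name, and the jq path
# into the flow JSON are assumptions; adjust them to your setup.

# Pull the flow ID out of a "Location: ...?flow=<id>" response header (stdin).
flow_id_from_headers() {
  grep -i '^location:' \
    | sed -n 's/.*[?&]flow=\([^&[:space:]]*\).*/\1/p' \
    | head -n 1
}

login_with_cookies() {
  session_file="./kratos-session.json"   # httpie persists cookies here

  # 1. Initialize a browser login flow. The response is expected to set the
  #    CSRF cookie and redirect to the login UI with ?flow=<id>.
  FLOW_ID=$(http --ignore-stdin --headers --session="$session_file" \
    GET "${KRATOS_PUBLIC}/self-service/login/browser" | flow_id_from_headers)

  # 2. Read the CSRF token from the flow's form fields (requires jq).
  CSRF_TOKEN=$(http --ignore-stdin \
    "${KRATOS_ADMIN}/self-service/login/flows?id=${FLOW_ID}" \
    | jq -r '.methods.password.config.fields[]
             | select(.name == "csrf_token") | .value')

  # 3. Submit the form token together with the cookie by reusing the session.
  http --ignore-stdin --form --session="$session_file" \
    POST "${KRATOS_PUBLIC}/self-service/login/methods/password?flow=${FLOW_ID}" \
    csrf_token="$CSRF_TOKEN" identifier='random' password='random'
}
```

With `KRATOS_PUBLIC` and `KRATOS_ADMIN` set, calling `login_with_cookies` should, if these assumptions hold, end in the ordinary validation error for bad credentials rather than the CSRF error.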

I am starting to understand the whole situation on my site. Here is what happened, in a browser:

  1. A customer opens the login page, which initializes the Kratos login flow, but does not log in.
  2. The customer takes no action until the CSRF cookie expires (or the cookie is removed).
  3. Now the customer tries to log in and sees the error flow.
  4. He clicks the browser back button, ends up on the /login?flow=xxxx page (the previous flow ID again), and tries to log in again. This is when he sees the error flow again and is convinced our site is broken.

In this situation, is it best practice for me to handle the error flow proactively by redirecting the user to /login, so that the user can initialize a new login flow and obtain a new CSRF token?

Hm, I was under the impression that CSRF failures should automagically initialize a new login/registration/… flow, but that might not actually be the case. Could you please create an issue if it is not? To test it, just initialize a login flow in a browser window, then copy the URL to a new incognito tab and submit it. You should end up at a new login flow; if not, that means there is a bug!
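The same check could also be scripted instead of using an incognito tab. The sketch below submits an existing flow without any cookies and reports where Kratos redirects. `redirect_kind`, `location_of`, and `submit_without_cookie` are hypothetical helpers, and the assumption that the error page is reached with an `?error=` query parameter may not match every setup.

```shell
# Sketch only: the ?flow= / ?error= query parameters used to tell the two
# outcomes apart are assumptions about how the UI URLs are configured.

# Classify a redirect target: ?flow=<id> means a (new) flow was started,
# ?error=<id> means the user was sent to the error page.
redirect_kind() {
  case "$1" in
    *"?flow="*|*"&flow="*)   echo "flow" ;;
    *"?error="*|*"&error="*) echo "error" ;;
    *)                       echo "unknown" ;;
  esac
}

# Print the value of the first Location header found on stdin.
location_of() {
  grep -i '^location:' | head -n 1 | sed 's/^[^:]*:[[:space:]]*//' | tr -d '\r'
}

# usage: submit_without_cookie "$FLOW_ID" "$CSRF_TOKEN"
# Sends the form token but no cookie, as a fresh incognito tab would.
submit_without_cookie() {
  target=$(http --ignore-stdin --headers --form \
    POST "${KRATOS_PUBLIC}/self-service/login/methods/password?flow=$1" \
    csrf_token="$2" identifier='random' password='random' | location_of)
  echo "redirected to: ${target} ($(redirect_kind "$target"))"
}
```

A result of `flow` would match the expected behaviour described above, while `error` would suggest the bug is real and worth reporting.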
