High Availability and Scaling


#1

Hi,

I’m trying to find info on scaling and High Availability for Hydra (at the moment). The main website suggests that Hydra is designed to scale, but that seems to be focused more on the feature/requirements side than on the workload side.

I think I remember reading that Hydra scales well with the resources you throw at it, but I couldn’t find any more specific numbers or guidelines anywhere.

So, as for scaling: am I correct that the way to scale Hydra is vertically, or can Hydra also be scaled horizontally by clustering multiple Hydra instances?

Also, I could not find any info on High Availability strategies for Hydra. Can Hydra instances be clustered, or can you just run multiple instances against the same database?

Some info/guidelines would be appreciated.

Some background from our side: we’re contemplating using Hydra in a Kubernetes environment to handle authentication on the API of our SaaS application.


#2

Hydra scales with the datastore. As such, you make it HA by making the datastore HA (e.g. Cloud Spanner, CockroachDB, …). There is no need for an etcd or memcached setup or anything like that. You make it cross-region if the datastore is cross-region.

Clustering works, and so do both horizontal and vertical scaling.
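
For illustration: since each instance is stateless, running Hydra highly available mostly comes down to putting several replicas behind a load balancer and health-checking each of them before routing traffic. Here’s a rough Go sketch of such a check; the hostnames, port and the /health/ready path are assumptions about a typical setup, so check the docs of the Hydra version you run:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// readinessURLs lists the Hydra replicas to check before keeping them
// in a load balancer's rotation. Hostnames, port and the /health/ready
// path are illustrative assumptions; adjust them to your deployment
// and Hydra version.
var readinessURLs = []string{
	"http://hydra-0.internal:4445/health/ready",
	"http://hydra-1.internal:4445/health/ready",
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	for _, u := range readinessURLs {
		resp, err := client.Get(u)
		if err != nil {
			fmt.Printf("%s: unreachable: %v\n", u, err)
			continue
		}
		resp.Body.Close()
		// A 200 here means this replica is up and can reach the shared
		// datastore, so it is safe to route traffic to it.
		fmt.Printf("%s: %s\n", u, resp.Status)
	}
}
```

In Kubernetes you’d typically express the same idea as a readiness probe on the pod instead of a standalone checker.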


#3

Thanks for the response.

So, to make sure I understand 100%: ORY Hydra itself is completely stateless, all state is stored in the DB, so you can run as many instances of Hydra as you want in parallel, all connecting to the same DB without issues, as long as your DB can handle it?

Might be worthwhile to add a little something about this to the docs…


#4

That’s it! I think we had a section about this way back but it seems to have been lost. A proper place for this would probably be “ORY Hydra in Production” under a new section “Scaling”.

By the way, we might add support for K8S storage at some point, or maybe some other datastore, depending on availability and popularity. But for now, SQL should be enough to handle most workloads!


#5

I saw that Hydra has support for storage plugins. Does that allow any storage implementation for customers who want a different storage backend (or multiple storage backends for different kinds of data)?

I saw a plugin implementation for Oracle Database in the ORY GitHub organization. Do you have any other examples of storage plugins?


#6

Excellent! Yeah, that’s the place I was looking for it :grinning: Want me to create an issue for adding this?

We happen to be using PostgreSQL already, so Hydra fits right in, but being able to plug in other storage providers would be handy. For example, if you happen to run on GCP and operate globally, Google Cloud Spanner might be an interesting datastore for Hydra.


#7

No docs on that unfortunately, but it’s possible to switch out the DBAL driver with a plugin without recompiling the whole project. Not sure which SQL dialect Cloud Spanner uses, but if it’s PG- or MySQL-compatible it shouldn’t be a problem.

The OracleDB adapter is really old, for versions 0.9.x; quite a few things have changed since then. My advice (always) is not to implement your own storage adapter. It will just lead to you not upgrading, because you’d have to update the store first and write migrations, and it’s a ton of work for something that is usually unjustifiable. The performance of PG and MySQL is stellar and you can handle immense traffic with them. If you’re at the scale of Google (3.5 billion searches per day), then yeah, this won’t work. But if you’re at the scale of a regular company, SQL is plenty.
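
For anyone wondering what “a plugin” means mechanically here: these are regular Go plugins, loaded into the host binary at runtime. A rough sketch of the mechanics is below; the Connector symbol and its Ping method are made up for illustration, the actual interface Hydra expects from a DBAL plugin isn’t covered in this thread, so check the docs of the version you run.

```go
// store_plugin.go - built separately from the main binary, e.g.:
//
//	go build -buildmode=plugin -o store.so .
//
// The Connector symbol and its methods are purely illustrative; Hydra
// defines its own contract for DBAL plugins.
package main

import "fmt"

type customStore struct{}

// Ping stands in for whatever the real storage interface requires.
func (customStore) Ping() error {
	fmt.Println("pretend we pinged the custom datastore")
	return nil
}

// Connector is the exported symbol a host process can look up at runtime.
var Connector customStore

// main is never called when the package is built as a plugin.
func main() {}
```

The host process then loads the compiled .so at runtime via the standard library’s plugin.Open and Lookup, which is why Hydra itself doesn’t need to be recompiled.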


#8

I saw your benchmarks on in-memory storage here: https://www.ory.sh/docs/guides/master/performance/1-hydra

It’s really cool to have performance numbers, but they’re in-memory, so they only show the computation overhead and don’t cover the communication between the SQL storage and Hydra. At least some numbers from your experience, or existing success stories, would help.

Just an idea: it would be cool to have benchmarks with SQL storage as well, to show how much we can get without worrying about storage. If we have several million token-check requests, should we start worrying about custom storage?


#9

I think this is very well covered in the docs:

We do not include benchmarks against databases (e.g. MySQL or PostgreSQL) as the performance greatly differs between deployments (e.g. request latency, database configuration) and tweaking individual things may greatly improve performance. We believe, for that reason, that benchmark results for these database adapters are difficult to generalize and potentially deceiving. They are thus not included.

I still see the same issue, especially when you consider the latency between the two components (Hydra, SQL) as part of performance. Unless you have a way to get unbiased results, I don’t think that will change quickly. Since the benchmarks themselves are open source and in the repository, you can quickly replicate them with a SQL store if you want.

I know of some companies that handle about 1M requests per day easily with a small or medium RDS instance.


#10

But in general, we cannot and do not replace your internal testing. You yourself have to see if the technology fits your purposes. If the features are what you need but you’re worried about performance, run tests that replicate your environment. The whole product suite is already freely available; we can’t do everything for you :slight_smile:
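
If it helps, a first-pass load test against your own deployment can be as simple as hammering the token introspection endpoint concurrently and counting requests per second. A rough Go sketch follows; the URL, port, token and worker counts are placeholders you’d replace with your own setup, and a real test should also track error rates and latency percentiles:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
	"sync"
	"time"
)

// The endpoint, port and token below are illustrative assumptions;
// point them at your own Hydra deployment and a real access token.
const (
	introspectURL = "http://hydra.internal:4445/oauth2/introspect"
	accessToken   = "replace-with-a-real-token"
	workers       = 20
	requestsEach  = 500
)

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	form := url.Values{"token": {accessToken}}.Encode()

	start := time.Now()
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < requestsEach; i++ {
				req, _ := http.NewRequest("POST", introspectURL, strings.NewReader(form))
				req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
				resp, err := client.Do(req)
				if err != nil {
					continue // a real test would count and report errors
				}
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()

	total := workers * requestsEach
	elapsed := time.Since(start)
	fmt.Printf("%d introspection requests in %s (%.0f req/s)\n",
		total, elapsed, float64(total)/elapsed.Seconds())
}
```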


#11

FYI - I wrote a Google Cloud Datastore plugin here: https://github.com/someone1/hydra-gcp - I’m still tinkering with swapping the signing mechanism, but you can compile a plugin from the /plugin subpackage.

Benchmarks are pretty useless, as mentioned earlier in this thread, but for the sake of relative performance I did run a few for comparison: https://github.com/someone1/hydra-gcp/tree/master/benchmarks

One thing I did notice: Hydra is very much CPU-bound rather than memory-bound - more cores and faster cores will dramatically increase performance, especially due to bcrypt hashing.
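
To get a feel for how much of that CPU time goes into bcrypt alone, you can time the hashing primitives directly. A small Go sketch (the cost values are illustrative; check what your Hydra configuration actually uses):

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/crypto/bcrypt"
)

func main() {
	secret := []byte("some-client-secret")

	// Each +1 in cost roughly doubles the CPU time per hash, which is
	// why Hydra benefits so much from faster or additional cores.
	for cost := 8; cost <= 12; cost++ {
		start := time.Now()
		hash, err := bcrypt.GenerateFromPassword(secret, cost)
		if err != nil {
			panic(err)
		}
		hashDur := time.Since(start)

		start = time.Now()
		_ = bcrypt.CompareHashAndPassword(hash, secret)
		compareDur := time.Since(start)

		fmt.Printf("cost=%d hash=%v compare=%v\n", cost, hashDur, compareDur)
	}
}
```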