Your Question
I am building a multi-tenant app with Postgres. Tenants are split by Postgres schema. On every HTTP request, I determine the tenant's schema from the domain or header information. I create a new *gorm.DB using Session() at the beginning of each request and pass it down for further querying. My question is: how do I set the schema on this object so that my further queries execute against that schema for the lifecycle of a specific HTTP request?
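A minimal sketch of the setup described above, assuming the schema comes from a hypothetical X-Tenant-Schema header (the header name, context key, and session options are illustrative assumptions); it only illustrates where the schema is determined per request, while making GORM actually query that schema is the open question:

```go
import (
	"context"
	"net/http"

	"gorm.io/gorm"
)

type tenantDBKey struct{}

// TenantMiddleware reads the tenant schema from the request, opens a
// per-request session, and stores it in the request context for handlers.
func TenantMiddleware(db *gorm.DB, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		schema := r.Header.Get("X-Tenant-Schema") // or derive it from r.Host
		tx := db.Session(&gorm.Session{NewDB: true}).Set("tenant_schema", schema)
		next.ServeHTTP(w, r.WithContext(context.WithValue(r.Context(), tenantDBKey{}, tx)))
	})
}
```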
A few approaches I tried:
- I tried `SET search_path TO <schema>` on each request, and it did not work since Go takes any available connection from the pool for each query.
- I tried setting `NamingStrategy.TablePrefix` on the `*gorm.DB` on each HTTP request, like below. But it looks like this configuration is only processed during initialization and not dynamically. Please correct me if I am wrong about this.

```go
db.Config.NamingStrategy = schema.NamingStrategy{TablePrefix: pgschema + "."}
```
Expected answer
What are the possible solutions to this situation?
Comment From: jinzhu
Use TableName or Scope with Table
Comment From: NeverBehave
Use TableName or Scope with Table
I would really appreciate some sample code, etc.
If using a table name, I am not sure how to determine the full table name when queries use a struct, e.g. `db.Where(&Instance{Id: 123})`. Under such circumstances, are we responsible for setting the correct name?
Comment From: ghost
Hi guys could you re-open this?
I have the same issue and this explanation:
Use TableName or Scope with Table
didn't help me much...
So in my case I have the following situation:
- I have a struct called `Category`
- I also have many tenants

My tenant is set dynamically, so as suggested I need to use Scopes to change the table name on the fly. Since the breaking change, `TableName` will not allow a dynamic table name anymore; the result of `TableName` will be cached for future use.
So how can I achieve this? I tried:
```go
func Tenant(tenant string) func(*gorm.DB) *gorm.DB {
	return func(db *gorm.DB) *gorm.DB {
		// What am I supposed to do here?
		return db.Table(tenant + "_" + originalTableName???)
	}
}
```
But I wasn't able to get the original table name ...
Comment From: NeverBehave
@mirusky-enube That's basically what I have so far. Really confusing.
In my understanding, the author is treating this as table prefixing so far (and you can't do that dynamically either, for now I guess)... Since schemas are a feature specific to Postgres, there isn't a dedicated optimization for them. From my research in the issues here, there isn't even a proposal to make schemas more useful.
Comment From: mirusky
@jinzhu could you reopen pls?
@NeverBehave I found a way, but it looks like a workaround, based on this issue.
We could use something like:
```go
func Tenant(tenant string, model interface{}) func(*gorm.DB) *gorm.DB {
	return func(db *gorm.DB) *gorm.DB {
		// Parse the model to discover its default table name.
		stmt := &gorm.Statement{DB: db}
		if err := stmt.Parse(model); err != nil {
			db.AddError(err)
			return db
		}
		// Qualify the table with the tenant's schema.
		return db.Table(tenant + "." + stmt.Schema.Table)
	}
}
```
It will work, but you need to pass the struct twice (in the scope and in the GORM method). I didn't test it with relationships like has one, has many, m2m, and belongs to...
But I believe it won't work with relationships.
Comment From: NeverBehave
@mirusky Oh, thanks for this partial workaround, but how would you use this Scope?
From what I tested, you have to specify the table first, but at that point I don't think the table name is fixed yet. E.g.

```go
db.Find(&users).Table("xxx") // not working, the table won't change
db.Table("xxx").Find(&users) // Table has to come before the finisher
```
I am not sure if I have done something wrong but I may come back and update this section later.
Comment From: mirusky
@NeverBehave I use it this way:

```go
db.Scopes(Tenant("tenant-1", User{})).Find(&User{ID: 1})
```

As I said, we need to pass the struct twice, so I'm not sure whether it will work with relationships.
Comment From: NeverBehave
@mirusky Thanks for the explanation! I missed the first parameter at first glance, ouch... Yeah, this is less flexible, and I don't think it works with Raw, etc. I also can't tell how the ORM can seamlessly adopt the schema under various other circumstances.
What about writing a DBResolver? I think it would be the best option so far: control the connection's schema state and the rest of the SQL doesn't need to change. Just use the same connection within the same session/context.
Update: I found that GORM can apparently extract the table name from raw SQL. I haven't tested it yet, but even if it works, I think the same limitation applies. I am also not sure whether a DBResolver Policy makes it possible to create/execute SQL against a specific schema.
Comment From: mirusky
I don't know. I was reading the docs again and I found Clauses and StatementModifier. Maybe it is possible to use them to change the SQL statement and add a schema before the table name...
But I need to learn how it works first...
Comment From: NeverBehave
The approach of changing clauses/statements would be problematic, as I think there would be lots of corner cases. I would prefer e.g. setting search_path and keeping the connection state for the same context/session.
It would be transparent for the application: none of the existing SQL needs to change as long as we keep the same connection state.
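A minimal sketch of that "keep the connection state" idea, using GORM's Connection to pin the callback to a single pooled connection; the tenant_a schema, the reset to DEFAULT, and the Animal model are placeholder assumptions, not code from this thread:

```go
// Run all of a tenant's queries on one pinned connection, without a transaction.
err := db.Connection(func(tx *gorm.DB) error {
	if err := tx.Exec("SET search_path TO tenant_a").Error; err != nil {
		return err
	}
	// Reset before the connection goes back to the pool, so another tenant's
	// request does not inherit this search_path.
	defer tx.Exec("SET search_path TO DEFAULT")

	var animals []Animal
	return tx.Find(&animals).Error
})
```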
Comment From: NeverBehave
@mirusky I am actually working on an approach: a self-managed pool of *gorm.DB.
Basically I wrote a simple DBManager that handles different tenants; the manager is shared and creates connections on demand. It avoids context switches (the database has to flush state if the search_path is switched within a connection) and potential corner cases when the existing application has complex queries.
The demo below does not limit the number of tenants/connections, and I am not sure how to (correctly) control GORM's connection pool manually; related: #3145
```go
import (
	"fmt"
	"os"
	"sync"
	"time"

	"gorm.io/driver/postgres"
	"gorm.io/gorm"
	gocache "zgo.at/zcache"
)

var cacheHandler *gocache.Cache

type StoreDB struct {
	CreationMutex sync.Mutex
	cache         *gocache.Cache
}

type Context struct {
	Schema string
}

// getEnv returns the value of the environment variable or a fallback default.
func getEnv(key, fallback string) string {
	if value, ok := os.LookupEnv(key); ok {
		return value
	}
	return fallback
}

func (s *StoreDB) CreateDBHandler(ctx *Context) *gorm.DB {
	// Maybe we could use a better mechanism to do this.
	s.CreationMutex.Lock()
	defer s.CreationMutex.Unlock()

	// Double-check before creating: another goroutine may have created the
	// handler while we were waiting for the lock.
	if db, found := s.cache.Touch(ctx.Schema, gocache.DefaultExpiration); found {
		return db.(*gorm.DB)
	}

	// Each tenant gets its own connection pool with search_path fixed in the DSN.
	dsn := fmt.Sprintf("host=%s user=%s password=%s dbname=%s sslmode=disable search_path=%s",
		getEnv("DB_HOST", "db"), getEnv("DB_USER", "postgres"), getEnv("DB_PASSWORD", "example"),
		getEnv("DB_NAME", "some-database"), ctx.Schema)
	db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})
	if err != nil {
		panic(err)
	}

	s.cache.Set(ctx.Schema, db, gocache.DefaultExpiration)
	return db
}

func (s *StoreDB) GetDBHandler(ctx *Context) *gorm.DB {
	if db, found := s.cache.Touch(ctx.Schema, gocache.DefaultExpiration); found {
		return db.(*gorm.DB)
	}
	return s.CreateDBHandler(ctx)
}

func initManager() *StoreDB {
	cacheHandler = gocache.New(10*time.Minute, 10*time.Minute)
	cacheHandler.OnEvicted(func(schema string, i interface{}) {
		// Close the underlying *sql.DB when a tenant handler is evicted.
		// https://github.com/go-gorm/gorm/issues/3145
		sqlDB, err := i.(*gorm.DB).DB()
		if err != nil {
			panic(err)
		}
		sqlDB.Close()
	})
	return &StoreDB{
		cache: cacheHandler,
	}
}
```
Example usage:

```go
dbMap := initManager()
gdb := dbMap.GetDBHandler(&Context{
	Schema: "some_schema",
})
// Now, gdb is a normal *gorm.DB with the given schema
```
The performance is basically the same as calling Open directly and running queries, and the application only needs to replace its db handler. I guess it may be an option.
@jinzhu Any suggestion? Thanks!
Comment From: cesc1802
I also need to handle this problem. Right now I am writing an interface to get a db instance by schema name. If someone has better solutions, please help me.
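For illustration, a minimal sketch of the kind of interface described above, assuming one *gorm.DB per schema; the names TenantDBProvider and DBFor are hypothetical, not from any existing package:

```go
// TenantDBProvider hands out a *gorm.DB bound to a tenant's schema, e.g. backed
// by a per-schema cache like the StoreDB manager shown earlier in this thread.
type TenantDBProvider interface {
	DBFor(schema string) (*gorm.DB, error)
}
```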
Comment From: mirusky
A different approach is to create a transaction:
```go
db.Transaction(func(tx *gorm.DB) error {
	// do the database operations in the transaction (use 'tx' from this point, not 'db')
	tx.Exec("SET search_path TO tenant_a;")

	// From this point, the whole transaction runs inside the defined search_path, aka the tenant
	if err := tx.Create(&Animal{Name: "Lion"}).Error; err != nil {
		return err
	}

	// prevent this connection from being reused with the wrong tenant
	tx.Exec("SET search_path TO DEFAULT;")

	// returning nil commits the whole transaction
	return nil
})
```
It is important to note that the search_path is changed twice, at the beginning and at the end, because of the PostgreSQL documentation:
If SET (or equivalently SET SESSION) is issued within a transaction that is later aborted, the effects of the SET command disappear when the transaction is rolled back. Once the surrounding transaction is committed, the effects will persist until the end of the session, unless overridden by another SET.
In other words, we need to reverse the first SET, or it will persist until the session is discarded.
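A possible variation (not tried in this thread) is SET LOCAL, which Postgres reverts automatically at the end of the current transaction, so no manual reset is needed; tenant_a and Animal are the same placeholders as above:

```go
db.Transaction(func(tx *gorm.DB) error {
	// SET LOCAL only lasts until this transaction commits or rolls back.
	if err := tx.Exec("SET LOCAL search_path TO tenant_a").Error; err != nil {
		return err
	}
	return tx.Create(&Animal{Name: "Lion"}).Error
})
```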
Comment From: cesc1802
So does it work with relations?
Comment From: ghost
@cesc1802 If all the models are in the same schema, it will probably work with relationships.
Comment From: NeverBehave
@mirusky's approach has a heavy performance penalty. As mentioned in the GORM docs, a transaction adds at least 30% overhead. And it is not only write actions: all reads/queries also need to be wrapped and suffer that cost, or your data will get mixed up.
And testing showed that switching/flushing the search_path on the fly also causes an extra performance hit: operation time is basically 2x compared to a direct connection.
@cesc1802 That's basically the solution I have built so far: create instances on demand and reuse them. The performance loss is negligible in this mode. However, watch out for max connections.
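A small sketch of that "watch out for max connections" caveat: capping each per-tenant *gorm.DB through the underlying database/sql pool (gdb is the handler from the earlier usage example; the exact limits are arbitrary assumptions):

```go
// Cap the pool of a per-tenant handler so many tenants together cannot
// exhaust Postgres's max_connections.
sqlDB, err := gdb.DB()
if err != nil {
	panic(err)
}
sqlDB.SetMaxOpenConns(5)                   // per-tenant cap (assumption)
sqlDB.SetMaxIdleConns(2)                   // keep the idle footprint small
sqlDB.SetConnMaxIdleTime(10 * time.Minute) // let idle tenants release connections
```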
Comment From: mirusky
@NeverBehave By default all the operations run inside a transaction, so if you haven't disabled it explicitly, there is no extra performance cost. As shown in the docs:
GORM perform write (create/update/delete) operations run inside a transaction
So since there is no default support for Postgres schemas, there are two workarounds:
- Create a gorm instance per schema
- Explicitly use transactions to change the schema on-the-fly
Comment From: NeverBehave
@mirusky I am saying that the on-the-fly approach forces every query (read action) to be wrapped inside a transaction, which is unnecessary in most use cases. Write actions (create/delete/etc.) already bear this transaction cost by default, which is okay, but the on-the-fly approach still adds extra overhead: the prepared statement cache gets invalidated. See https://dba.stackexchange.com/a/166511
Changing the search_path causes Postgres to start from scratch with prepared statements. The manual on PREPARE:
Also, if the value of search_path changes from one use to the next, the statement will be re-parsed using the new search_path. (This latter behavior is new as of PostgreSQL 9.3.)
It is a serious degradation (a double degradation for queries!) in normal use cases, especially when reads/queries outnumber writes/updates, and it is hard to work around if not handled carefully.
Comment From: bartventer
TL;DR solution: I published a package which provides drivers supporting multi-tenancy, along with some handy HTTP middleware: https://github.com/bartventer/gorm-multitenancy