HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Can we execute an optimal plan instead of a generic one in the first execution of a PL/pgSQL function?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
genericcantheoptimalfunctioninsteadplanonefirstpgsql

Problem

I have a really busy function that I need to optimize the best way I can. This function is just a nested select statement that is requested several times a sec by a legacy application.

The indexes are in place but I've noticed that it is only used after the first execution of the function. I think that the problem is that Postgres creates a general execution plan because of a parameter that it is highly exclusive in the majority of the cases, but it can be not as good sometimes.

When I test with EXPLAIN ANALYZE after the first execution, the query runs really fast but the app sessions call the function only once and then are terminated. I need that the first execution use the actual optimized plan. Anyone can help?

We tried messing around with the connector driver that manage the connection pooling to issue a DISCARD TEMP instead of DISCARD ALL, so it could keep the cached plan of the sessions and the performance went through the roof, but I don't want to do that in a production environment.

We are on a Postgres 9.4 running on CentOS 6. I've tried running as a SQL function but it didn't help, it was actually faster as a plpgsql func. Here is the function code:

```
CREATE OR REPLACE FUNCTION public.ap_keepalive_geteqpid_veiid(
IN tcbserie bigint,
IN protocolo integer)
RETURNS TABLE(eqpid integer, veiid integer, tcbid integer, veiplaca character varying, veiproprietariocliid integer, tcbtppid integer, tcbversao character, veirpmparametro double precision, tcbconfiguracao bigint, tcbevtconfig integer, veibitsalertas integer, sluid integer, harid integer) AS
$BODY$
BEGIN
RETURN QUERY
SELECT teqp.eqpID,
teqp.eqpveiID AS veiID,
tcb.tcbID,
tvei.veiPlaca,
tvei.veiProprietariocliID,
tcb.tcbtppID,
tcb.tcbVersao,
tvei.veiRPMParametro,
COALESCE(COALESCE(NULLIF(tcb.tcbConfiguracao, 0), tcc.clcConfiguracaoBitsVeic), 0) AS tcbConfiguracao,

Solution

I cleaned up and simplified a few minor details. That shouldn't change much for performance, yet. But the added SET join_collapse_limit = 1 might:

CREATE OR REPLACE FUNCTION public.ap_keepalive_geteqpid_veiid(tcbserie bigint, protocolo int)
  RETURNS TABLE(eqpid int, veiid int, tcbid int
              , veiplaca varchar, veiproprietariocliid int, tcbtppid int, tcbversao character, veirpmparametro double precision
              , tcbconfiguracao bigint, tcbevtconfig int, veibitsalertas int, sluid int, harid int) AS
$func$
BEGIN
   RETURN QUERY
   SELECT eqp.eqpID
        , eqp.eqpveiID AS veiID
        , cb.tcbID
        , vei.veiPlaca
        , vei.veiProprietariocliID
        , cb.tcbtppID
        , cb.tcbVersao
        , vei.veiRPMParametro
        , CASE WHEN cb.tcbConfiguracao = 0 THEN COALESCE(cc.clcConfiguracaoBitsVeic, 0)
               ELSE cb.tcbConfiguracao END -- AS tcbConfiguracao
        , COALESCE(cb.tcbevtConfig, 0)     -- AS tcbevtConfig
        , COALESCE(vei.veiBitsAlertas, 0)  -- AS veiBitsAlertas
        , COALESCE(vei.veisluID, 0)        -- AS sluID
        , COALESCE(cb.tcbharID, 0)         -- AS harID
   FROM   TabEquipamento        eqp
   JOIN   TabVeiculos           vei ON vei.veiID = eqp.eqpveiID
   JOIN   TabComputadorBordo    cb  ON cb.tcbID  = eqp.eqptcbID
   JOIN   TabPacoteProduto      pp  ON pp.tppID  = eqp.eqptppID
   LEFT   JOIN TabCliente       cli ON cli.cliid = vei.veiProprietariocliID
   LEFT   JOIN TabClienteConfig cc  ON cc.clcCliID = cli.cliID
   WHERE  eqp.eqpAtivo = 1
   AND    vei.veiBloqueioSinal = 0
   AND    cb.tcbserie = $1
   AND    pp.tppIDProtocolo = $2;
END
$func$  LANGUAGE plpgsql VOLATILE STRICT COST 10000 ROWS 1
        SET join_collapse_limit = 1; -- see below!


Notes

COALESCE can take multiple parameters, no need to nest:

COALESCE(NULLIF(cb.tcbConfiguracao, 0), cc.clcConfiguracaoBitsVeic, 0) AS tcbConfiguracao


The CASE expression I ended up using above should be a bit faster, yet.

character is suspicious as data type. It's the same as char(1), I assume you are aware of that.

Inside PL/pgSQL, column aliases that are not referenced in the same query are just for documentation. Only the names in the RETURNS TABLE clause are visible outside the function.

COST 100 is the default for a user-defined function and probably way off for your case. 10000 is probably a better estimate, but unless you nest this function in an outer query, this has hardly any effect.

I removed the parameter defaults (as discussed) and made the function STRICT, since the query would return nothing for any NULL input anyway.

I simplified your table aliases and formatted some more to make it easier to read and play with. That last bit is largely a matter of taste and style.

Answer to question

As to your actual question: There are no (direct) hints for the query planner (optimizer) in PostgreSQL like there are in other RDBMS. Details in the Postgres Wiki on "OptimizerHintsDiscussion".

PL/pgSQL works with prepared statements internally. It will re-plan queries inside the function body with the given input parameters for the first couple of calls per session. Only if it finds that specific plans don't perform better than a generic plan it will then switch to a generic plan and keep that, which saves some overhead.

Details:

  • PostgreSQL Stored Procedure Performance



However, there are a couple of settings you can adjust. In particular, if you know the best query plan, you can force Postgres to take the order of joins in the FROM clause as given and not try to reorder (which can get expensive for many tables - and you have 6 of them), by setting the join_collapse_limit. This will reduce the cost of planning the query. If done right, it will make the first couple of calls faster. If you mess it up, performance will suffer, of course.

You could put a SET LOCAL as first command:

...
BEGIN
   SET LOCAL join_collapse_limit = 1;
   RETURN QUERY ...


Better yet, declare it as an attribute of the function itself like I did above. The effect of SET LOCAL in the function body would last till the end of the transaction, but, per documentation:


The SET clause causes the specified configuration parameter to be set
to the specified value when the function is entered, and then restored
to its prior value when the function exits.

Obviously, you need to get the sequence of joins in your FROM clause right yourself. And it has to be good for all possible combinations of parameters. Postgres will not try to optimize. (The STRICT modifier simplifies a bit, because NULL values are excluded now.)

Put tables with the most selective predicates first.

WARNING: Be aware that such optimizations may turn from helpful to obstructive after upgrades to your Postgres version or any major changes in your DB.

Related answer with more details:

  • Slow fulltext search due to wildly inaccurate row estimates



Aside

I assume you are aware that ther

Code Snippets

CREATE OR REPLACE FUNCTION public.ap_keepalive_geteqpid_veiid(tcbserie bigint, protocolo int)
  RETURNS TABLE(eqpid int, veiid int, tcbid int
              , veiplaca varchar, veiproprietariocliid int, tcbtppid int, tcbversao character, veirpmparametro double precision
              , tcbconfiguracao bigint, tcbevtconfig int, veibitsalertas int, sluid int, harid int) AS
$func$
BEGIN
   RETURN QUERY
   SELECT eqp.eqpID
        , eqp.eqpveiID AS veiID
        , cb.tcbID
        , vei.veiPlaca
        , vei.veiProprietariocliID
        , cb.tcbtppID
        , cb.tcbVersao
        , vei.veiRPMParametro
        , CASE WHEN cb.tcbConfiguracao = 0 THEN COALESCE(cc.clcConfiguracaoBitsVeic, 0)
               ELSE cb.tcbConfiguracao END -- AS tcbConfiguracao
        , COALESCE(cb.tcbevtConfig, 0)     -- AS tcbevtConfig
        , COALESCE(vei.veiBitsAlertas, 0)  -- AS veiBitsAlertas
        , COALESCE(vei.veisluID, 0)        -- AS sluID
        , COALESCE(cb.tcbharID, 0)         -- AS harID
   FROM   TabEquipamento        eqp
   JOIN   TabVeiculos           vei ON vei.veiID = eqp.eqpveiID
   JOIN   TabComputadorBordo    cb  ON cb.tcbID  = eqp.eqptcbID
   JOIN   TabPacoteProduto      pp  ON pp.tppID  = eqp.eqptppID
   LEFT   JOIN TabCliente       cli ON cli.cliid = vei.veiProprietariocliID
   LEFT   JOIN TabClienteConfig cc  ON cc.clcCliID = cli.cliID
   WHERE  eqp.eqpAtivo = 1
   AND    vei.veiBloqueioSinal = 0
   AND    cb.tcbserie = $1
   AND    pp.tppIDProtocolo = $2;
END
$func$  LANGUAGE plpgsql VOLATILE STRICT COST 10000 ROWS 1
        SET join_collapse_limit = 1; -- see below!
COALESCE(NULLIF(cb.tcbConfiguracao, 0), cc.clcConfiguracaoBitsVeic, 0) AS tcbConfiguracao
...
BEGIN
   SET LOCAL join_collapse_limit = 1;
   RETURN QUERY ...

Context

StackExchange Database Administrators Q#123618, answer score: 2

Revisions (0)

No revisions yet.