HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Postgresql text column character Limit

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlcolumnlimittextcharacter

Problem

Im not even close to a DBA, so plain English please. I recently needed to look into a performance issue on a relatively small PG database, but yet is performing very slow. the server is running PGSql 9.0.22. Its largest table is about 30Mb, and the entire db is hardly 80Mb. However, i found that one of the tables has a text column (which i assume has no width limit) which probably stretches about 10 miles long. I copied the text from one column into Word, an i got 377 pages and a word count of 69814 (as in sixty eight thousand). Again, im not im not expert, but it just doesnt seem right to me. Can PG handle such large text columns? Is this considered a normal design/practice?

EDIT

Thank you both for your response. I would have to research its purpose and types of queries running against it. For the sack of argument, lets assume the column has a list of fruits, and is just a point of reference. Meaning, there's a query which just returns a True or Flase if Avocado is on that list, and then moves on to another action based on the True/Flase results. So theres a text "Fruits" column in the "Foods" table, the text in the column might look like this (with just spaces between each fruit):

Apple Avacado Mango Orange Cherry Grape Peach Srawberry ... and on listing about 68,000 fruits

Would it have been beneficial, or perhaps considered a better design, if each fruit was in its own field. In other words, have 68000 fields under the "Fruits" column, as opposed to one text column with 68000 words?

EDIT
I didn't mean a column for each fruit. I meant a row for each fruit under the fruit column

**Columns - Fruits, Vegetables** 

  Rows      -Apple     -Corn
            -Avocado   -Tomatoes
            -Mango
            -Orange
            -Cherry
            -Grape
            -Peach
            -Strawberry


And so on...a list of about 1.5 million

EDIT

In short, I mean having each piece of data in its own field as opposed to using a large text column. So for ex

Solution

You are right in thinking that items of a list should be stored as individual values of a column, i.e. in separate rows of that column, rather than as a single value (CSV string or anything like that) – at least if you expect to have queries against individual items of the list. Relational databases are designed to work that way, and storing multiple values of a column as a list (single value) is a no-no in such cases.

Regarding your data sample of fruits and vegetables, I am assuming that you are not thinking of splitting two lists of values exactly like that, storing each list's elements alongside each other on the same row without there actually being a relationship between them, just for the sake of normalisation of the way the vegetables and fruits are stored. That would be wrong.

I mean, if currently you have a row with a list of fruits and a list of vegetables such that each fruit and each vegetable are related to this particular row, i.e. like this:

FoodsID  Fruits                      Vegetables     other  columns
-------  --------------------------  -------------  -----  -------
100      Apple,Avocado,Mango,Orange  Corn,Tomatoes  ...    ...


and you want to expand each list into a row set while keeping the relationship of each fruit and each vegetable to the Foods row, you do not just add more rows and put the first item of one list together with the first item of the other list on the same row, then the second item of each list on another row and so on, like this:

FoodsID  Fruits   Vegetables  ...
-------  -------  ----------  -----
100      Apple    Corn        ...
100      Avocado  Tomatoes    ...
100      Mango                ...
100      Orange               ...


Again, that would be wrong. Fruit Apple may well be related to food item 100 as may vegetable Corn, but if these two items are not related to each other, there is no reason to store them on the same row.

Commonly, when you want to store a relationship between a subset and an item, you use a separate table. As you have two subsets of different kinds that are related to the same item, you just use two tables. So it would be:

-
table Foods:

FoodsID  other  columns
-------  -----  -------
100      ...    ...


-
table Fruits:

FoodsID  Fruit
-------  -------
100      Apple
100      Avocado
100      Mango
100      Orange


-
table Vegetables:

FoodsID  Vegetable
-------  ---------
100      Corn
100      Tomatoes


where FoodsID in table Foods would be the primary key of that table and same-named column in tables Fruits and Vegetable would serve as a reference (foreign key) to the corresponding row in Foods.

This way, when querying against individual items of either list as belonging to that particular row in Foods, you would probably use a join. Your typical filter conditions would be as simple as Vegetable = 'some vegetable' or Fruit IN ('some fruit', 'some other fruit') – not FindPos(Vegetable, 'some vegetable') > 0 or anything of the kind. The conditions would be simpler because each column would now contain one value per row. That will lend well to further optimisation of performance, because when you query often against a column value (without applying a function to it first, that is), you can make such queries faster by adding an index on the column – something that would be pointless for lists stored as a single value.

More can be said on this topic, which would, however, be beyond the scope of your question. I suggest you look up the terms I highlighted in bold for more information.

Code Snippets

FoodsID  Fruits                      Vegetables     other  columns
-------  --------------------------  -------------  -----  -------
100      Apple,Avocado,Mango,Orange  Corn,Tomatoes  ...    ...
FoodsID  Fruits   Vegetables  ...
-------  -------  ----------  -----
100      Apple    Corn        ...
100      Avocado  Tomatoes    ...
100      Mango                ...
100      Orange               ...
FoodsID  other  columns
-------  -----  -------
100      ...    ...
FoodsID  Fruit
-------  -------
100      Apple
100      Avocado
100      Mango
100      Orange
FoodsID  Vegetable
-------  ---------
100      Corn
100      Tomatoes

Context

StackExchange Database Administrators Q#124966, answer score: 3

Revisions (0)

No revisions yet.