patternMinor
Extract filename without extension from the absolute location
Problem
I'm trying to get a filename contained in the value of a specific column of my table. My table looks like this:
absolute_path
\\Path\filename.extension
I need to extract the filename (filename in the example above) from the absolute_path (\\Path\filename.extension). Which function should I use to get my filename (substring) out?
Solution
Edit:
Even though my first solution answered the question as asked, I saw @DavidBoho's answer and he made several good points. He suggested that if the filename is my_file.tar.gz then the return value should be my_file.tar, and also pointed out that my solution would fail if the file had no extension at all. All of the code here is available on this fiddle.
Given the table and data as follows:
CREATE TABLE with_filename
(
file_id INTEGER,
file_name VARCHAR (256)
);
populate:
INSERT INTO with_filename
VALUES
(1, '/users/mcm1/ualaoip2/vmm/file1.pdf'),
(2, '/users/mcm1/ualaoip2/vmm/file2.py'),
(3, '/users/mcm1/ualaoip2/vmm/file3.pdf'),
(4, '/users/mcm1/ualaoip2/vmm/file4.c'),
(5, '/users/mcm1/ualaoip2/vmm/file5.java'),
(6, '/users/mcm1/ualaoip2/vmm/file6.class'),
(7, '/users/mcm1/ualaoip2/vmm/file7'),
(8, '/users/mcm1/ualaoip2/vmm/file8.tar.gz'),
(9, '/users/mcm1/my_prog.cpp');
My original solution:
SELECT LEFT(
RIGHT(file_name, POSITION('/' IN REVERSE(file_name)) - 1),
POSITION('.' IN
RIGHT(file_name, POSITION('/' IN REVERSE(file_name)) - 1)) - 1
) AS my_file
FROM with_filename;
gives the result:
my_file
file1
file2
file3
file4
file5
file6
file -- << should be file7
file8 -- << should be file8.tar
my_prog
In his post, @DavidBoho used the SPLIT_PART function to resolve the problems with files 7 & 8 - see the fiddle. I decided to look again at my own SQL and I came up with this (perhaps more traditional?):
SELECT
REPLACE(SUBSTRING(file_name, (LENGTH(file_name) + 2) - POSITION('/' IN REVERSE(file_name))),
RIGHT(file_name, POSITION('.' IN LEFT(REVERSE(file_name), POSITION('/' IN REVERSE(file_name)) - 1))),
'') AS the_files
FROM with_filename;
Result:
the_files
file1
file2
file3
file4
file5
file6
file7
file8.tar
my_prog
which is also the correct answer!
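For readers who can't open the fiddle, here is a rough sketch of how SPLIT_PART might be applied to the same table. To be clear, this is my own illustration and not @DavidBoho's actual code; the aliases base_name and the_file are made up for the example, and it assumes PostgreSQL with the with_filename table defined above.

```sql
-- Sketch only: assumes the with_filename table defined above (PostgreSQL).
SELECT file_name,
       CASE
           WHEN POSITION('.' IN base_name) > 0
           -- Reverse, drop everything up to and including the first '.', reverse back:
           -- this removes only the LAST extension (file8.tar.gz -> file8.tar).
           THEN REVERSE(SUBSTRING(REVERSE(base_name)
                                  FROM POSITION('.' IN REVERSE(base_name)) + 1))
           -- No dot at all: return the name unchanged (file7 stays file7).
           ELSE base_name
       END AS the_file
FROM (
    -- The first '/'-separated field of the reversed path is the
    -- reversed last path component, so reverse it back.
    SELECT file_name,
           REVERSE(SPLIT_PART(REVERSE(file_name), '/', 1)) AS base_name
    FROM with_filename
) AS sq;
```

One caveat: a name that starts with a dot and has no other dot (a hypothetical .bashrc, say) would come back as an empty string with this sketch, so it may need an extra CASE branch depending on what you want there.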
While I was grasping for a solution, I became interested in regular expressions as a means of solving this problem. Even though we were able to solve this using "traditional" SQL, it became clear to me that regexes are extremely powerful. SQL may now be Turing complete, but it can rapidly become convoluted for relatively simple string manipulation problems, so I decided to investigate.
I found two regex solutions - in fairness, I can't claim to have devised them myself; they are the result of a question I asked on StackOverflow. So, the regex solutions are as follows:
The best one is this one -
SELECT
  file_name,
  REGEXP_REPLACE(file_name, '^.*/([^/]*?)(\.[^/.]+)?$', '\1') AS filename
FROM with_filename;
There is a second one, but IMHO (and in that of the original author) it's a bit of a hack - it involves two nested REGEXP_REPLACEs:
SELECT
  file_name,
  REGEXP_REPLACE(REGEXP_REPLACE(file_name, '^.*/(.*)$', '\1'), '\.[^.]+$', '') AS filename
FROM with_filename;
Finally, there may be a solution possible using the UNNEST and the STRING_TO_ARRAY functions together - I came up with this code:
SELECT fn,
LEFT(fn, POSITION('.' IN fn) - 1) AS lef
FROM with_filename w,
UNNEST(STRING_TO_ARRAY(w.file_name, '/')) AS fn
GROUP BY fn
HAVING COUNT(fn) = 1
ORDER BY lef;
which gives the result:
fn lef
file7 file -- << should be file7
file1.pdf file1
file2.py file2
file3.pdf file3
file4.c file4
file5.java file5
file6.class file6
file8.tar.gz file8 -- << should be file8.tar
my_prog.cpp my_prog
I tried lots of different permutations with this, but couldn't get it to work. Would be grateful for any input! :-)
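One direction that might make the STRING_TO_ARRAY idea work is to skip UNNEST altogether and index the last element of the array directly, then strip the final extension with the same '\.[^.]+$' pattern used in the nested regex solution. This is only a sketch against the with_filename table above; it assumes PostgreSQL 9.4 or later (for CARDINALITY and LATERAL), and the aliases s, parts and the_file are made up for the example.

```sql
-- Sketch only: assumes the with_filename table defined above (PostgreSQL 9.4+).
SELECT w.file_name,
       -- parts[CARDINALITY(parts)] is the last '/'-separated element, i.e. the file name;
       -- the regex then removes only the final ".ext", if there is one.
       REGEXP_REPLACE(s.parts[CARDINALITY(s.parts)], '\.[^.]+$', '') AS the_file
FROM with_filename w
CROSS JOIN LATERAL (
    SELECT STRING_TO_ARRAY(w.file_name, '/') AS parts
) AS s
ORDER BY w.file_id;
```

Because nothing is unnested, there is no need for the GROUP BY / HAVING COUNT(fn) = 1 trick to filter out the directory names, and each input row maps to exactly one output row.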
Another interesting function is REGEXP_SPLIT_TO_TABLE.
SELECT
fn,
COUNT(fn)
FROM
(
SELECT REGEXP_SPLIT_TO_TABLE(w.file_name, '/') AS fn
FROM with_filename w
) AS sq
GROUP BY fn
HAVING COUNT(fn) = 1
ORDER BY fn
Result:
fn count
file1.pdf 1
file2.py 1
file3.pdf 1
file4.c 1
file5.java 1
file6.class 1
file7 1
file8.tar.gz 1
my_prog.cpp 1
Again, this might be worth pursuing - didn't have time.
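If someone does want to pursue it, one possibility is to number the pieces with WITH ORDINALITY and keep only the last piece per row. Treat the following as an untuned sketch: it assumes PostgreSQL 9.4 or later, that file_id is unique in with_filename, and the aliases t, fn, ord and the_file are made up for the example.

```sql
-- Sketch only: assumes the with_filename table defined above and a unique file_id.
SELECT DISTINCT ON (w.file_id)
       w.file_id,
       -- Strip only the final ".ext" from the last path component.
       REGEXP_REPLACE(t.fn, '\.[^.]+$', '') AS the_file
FROM with_filename w
CROSS JOIN LATERAL
     REGEXP_SPLIT_TO_TABLE(w.file_name, '/') WITH ORDINALITY AS t(fn, ord)
-- For each file_id, DISTINCT ON keeps the first row in this ordering,
-- which is the piece with the highest ordinal, i.e. the file name.
ORDER BY w.file_id, t.ord DESC;
```

This avoids relying on HAVING COUNT(fn) = 1, which only works while no directory name or file name repeats across rows.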
Example DML/DDL
CREATE TABLE with_filename
(
file_id INTEGER,
file_name VARCHAR (256)
);
INSERT INTO with_filename
VALUES
(1, '/users/mcm1/ualaoip2/vmm/file1.pdf'),
(2, '/users/mcm1/ualaoip2/vmm/file2.py'),
(3, '/users/mcm1/ualaoip2/vmm/file3.pdf'),
(4, '/users/mcm1/ualaoip2/vmm/file4.c'),
(5, '/users/mcm1/ualaoip2/vmm/file5.java'),
(6, '/users/mcm1/ualaoip2/vmm/file6.class'),
(7, '/users/mcm1/ualaoip2/vmm/file7'),
(8, '/users/mcm1/ualaoip2/vmm/file8.tar.gz'),
(9, '/users/mcm1/my_prog.cpp');
Code Snippets
CREATE TABLE with_filename
(
file_id INTEGER,
file_name VARCHAR (256)
);
INSERT INTO with_filename
VALUES
(1, '/users/mcm1/ualaoip2/vmm/file1.pdf'),
(2, '/users/mcm1/ualaoip2/vmm/file2.py'),
(3, '/users/mcm1/ualaoip2/vmm/file3.pdf'),
(4, '/users/mcm1/ualaoip2/vmm/file4.c'),
(5, '/users/mcm1/ualaoip2/vmm/file5.java'),
(6, '/users/mcm1/ualaoip2/vmm/file6.class'),
(7, '/users/mcm1/ualaoip2/vmm/file7'),
(8, '/users/mcm1/ualaoip2/vmm/file8.tar.gz'),
(9, '/users/mcm1/my_prog.cpp');
SELECT LEFT(
RIGHT(file_name, POSITION('/' IN REVERSE(file_name)) - 1),
POSITION('.' IN
RIGHT(file_name, POSITION('/' IN REVERSE(file_name)) - 1)) - 1
) AS my_file
FROM with_filename;
my_file
file1
file2
file3
file4
file5
file6
file -- << should be file7
file8 -- << should be file8.tar
my_prog
SELECT
REPLACE(SUBSTRING(file_name, (LENGTH(file_name) + 2) - POSITION('/' IN REVERSE(file_name))),
RIGHT(file_name, POSITION('.' IN LEFT(REVERSE(file_name), POSITION('/' IN REVERSE(file_name)) - 1))),
'') AS the_files
FROM with_filename;
Context
StackExchange Database Administrators Q#190982, answer score: 8