Parsing the lsblk output
Problem
I am a Python beginner learning Python 3. I have written two small functions that parse the lsblk output and return Linux physical and logical disks. Here is the first function:
from subprocess import run, PIPE


def physical_drives():
    """
    Gets all physical drive names.

    Gets all physical drive names on a Linux system,
    parsing the lsblk utility output.

    Parameters
    ----------

    Returns
    -------
    list
        A list of strings representing drive names.
    """
    command = ['lsblk -d -o name -n']
    output = run(command, shell=True, stdout=PIPE)
    output_string = output.stdout.decode('utf-8')
    output_string = output_string.strip()
    results = output_string.split('\n')
    return results


def main():
    print(physical_drives())


if __name__ == '__main__':
    main()

The second function:
from subprocess import run, PIPE


def partitions(disk):
    """
    Gets all partitions for a given physical disk.

    Gets all partitions present on a physical disk
    on a Linux system.
    The function parses the lsblk utility output.

    Parameters
    ----------
    disk : string
        A string containing a disk name such as 'sda'

    Returns
    -------
    list
        A list of strings representing partitions.
    """
    command = ['lsblk -o name -n -s -l']
    output = run(command, shell=True, stdout=PIPE)
    output_string = output.stdout.decode('utf-8')
    output_string = output_string.strip()
    results = list()
    results.extend(output_string.split('\n'))
    results = [x for x in results if x != disk and disk in x]
    return results


def main():
    from disks import physical_drives
    for drive in physical_drives():
        print(drive)
        parts = partitions(drive)
        for partition in parts:
            print('\t' + partition)


if __name__ == '__main__':
    main()

The functions are in two different files in the same directory. I would appreciate a quick review on anything tha
Solution
lsblk
The -s option to lsblk was introduced to util-linux rather recently, in release 2.22. You may experience compatibility issues on slightly older GNU/Linux installations.

But I don't see why you would want the -s option at all: it just gives you an inverted device tree. For example, on my machine:

$ lsblk -o name -n -s -l
sda1
sda
sda2
sda
sr0
vg-root
sda3
sda
vg-var
sda3
sda
vg-data
sda3
sda

In the output, sda appears multiple times. To understand the output, you need to drop the -l flag so that the list appears in tree form:

$ lsblk -o name -n -s
sda1
└─sda
sda2
└─sda
sr0
vg-root
└─sda3
  └─sda
vg-var
└─sda3
  └─sda
vg-data
└─sda3
  └─sda

Now, it's more apparent that the -s option isn't helpful. If you drop it, then the output makes more sense:

$ lsblk -o name -n
sda
├─sda1
├─sda2
└─sda3
  ├─vg-root
  ├─vg-var
  └─vg-data
sr0
$ lsblk -o name -n -l
sda
sda1
sda2
sda3
vg-root
vg-var
vg-data
sr0

To list the devices on sda, it would be better to run lsblk -o name -n -l /dev/sda; that would immediately drop sr0 from consideration, for example. Note that LVM volumes (such as vg-root above) would still appear in the output.

I don't think that doing a substring search (if x != disk and disk in x in your code) is a reliable filter. It could be fooled if there are more than 26 physical disks: the 27th disk would be named sdaa, and since 'sda' in 'sdaa' is true, sdaa and its partitions would be reported as partitions of sda. It might also be fooled by exceptionally tricky naming of LVM volumes.

Subprocess execution
Whenever practical, I recommend avoiding the shell when executing subprocesses. The shell introduces a set of potential security vulnerabilities, for example, shenanigans with the PATH environment variable. Best practice would be to run the command with a specific executable and pre-parsed command-line options:

run('/bin/lsblk -o name -n -s -l'.split(), stdout=PIPE)

Alternative solution
I actually wouldn't bother with parsing the output of lsblk at all. After all, lsblk is just a way to report the contents of the sysfs filesystem. You would be better off inspecting /sys directly.

from glob import glob
from os.path import basename, dirname


def physical_drives():
    drive_glob = '/sys/block/*/device'
    return [basename(dirname(d)) for d in glob(drive_glob)]


def partitions(disk):
    if disk.startswith('.') or '/' in disk:
        raise ValueError('Invalid disk name {0}'.format(disk))
    partition_glob = '/sys/block/{0}/*/start'.format(disk)
    return [basename(dirname(p)) for p in glob(partition_glob)]
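
To try the /sys approach without touching real hardware, here is a minimal sketch of the same two functions with the sysfs root passed in as a parameter. The sys_block parameter, the make_fake_sysfs helper, and the sorted() calls are my own assumptions for illustration and are not part of the code above; the fake tree only imitates the two things the globs look for (a device entry per drive, a start file per partition).

```python
import os
import tempfile
from glob import glob
from os.path import basename, dirname, join


def physical_drives(sys_block='/sys/block'):
    """Return names of block devices that expose a 'device' entry."""
    # sys_block is an assumed parameter so the lookup can be pointed at a test tree.
    pattern = join(sys_block, '*', 'device')
    return sorted(basename(dirname(d)) for d in glob(pattern))


def partitions(disk, sys_block='/sys/block'):
    """Return partition names of one disk: subdirectories holding a 'start' file."""
    if disk.startswith('.') or '/' in disk:
        raise ValueError('Invalid disk name {0}'.format(disk))
    # The disk name selects one directory exactly, so 'sda' never matches 'sdaa'.
    pattern = join(sys_block, disk, '*', 'start')
    return sorted(basename(dirname(p)) for p in glob(pattern))


def make_fake_sysfs(root, layout):
    """Hypothetical test helper: build a directory tree shaped like /sys/block."""
    for disk, parts in layout.items():
        os.makedirs(join(root, disk, 'device'))
        for part in parts:
            os.makedirs(join(root, disk, part))
            with open(join(root, disk, part, 'start'), 'w') as f:
                f.write('2048\n')


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        make_fake_sysfs(root, {'sda': ['sda1', 'sda2'], 'sdaa': ['sdaa1']})
        print(physical_drives(root))
        print(partitions('sda', root))
```

With the fake layout above, the drive list contains both sda and sdaa, while partitions('sda', root) returns only sda1 and sda2. Because the disk name picks a single directory instead of being matched as a substring, the 27th-disk problem does not arise.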

Context
StackExchange Code Review Q#152486, answer score: 11