patternpythonMinor
Storing data in the pixels of an image
Viewed 0 times
imagethedatapixelsstoring
Problem
I've recently redone something I wrote around a year ago since I'd done it wrong (it required the original image to decode an imagine, which can be avoided), so I'm wondering if I could either get feedback on the style of writing, or on which bits can be broken, since I've tried to make it fairly robust. The idea is to store stuff in images by modifying the pixels, so you could send files over image hosts and stuff like that.
First off, I tried following that 72/80 line thing for the first time, and had to do some not so nice cuts/renames, especially with the list comprehension bits, so that's why a lot of stuff is split over multiple lines.
The one bit I'm not so sure about is the Imgur authentication. Since user accounts get a 4MB upload limit and anonymous accounts get a 1MB upload limit (and this obviously requires all images to stay as PNG), I've kind of forced the code to use an account if upload is enabled. If it is not the users account (there is an option to authenticate your own account), it'll just upload to a default account I set up, where the
So, as to the actual code, if you input an image to write over, it'll calculate the minimum bits it can use to store the required data. The first red value will be used as a header, then the data will be split and replace the last few bits of each colour in the image. This goes up to 7, then after that at 8, there is no point in keeping the original resolution since the original image would exist no more, so it just reverts to as if you never input an image.
If you don't input an image, you can choose a ratio since the resolution can be anything, and it defaults to
I mentioned a header before and I go into more detail
First off, I tried following that 72/80 line thing for the first time, and had to do some not so nice cuts/renames, especially with the list comprehension bits, so that's why a lot of stuff is split over multiple lines.
The one bit I'm not so sure about is the Imgur authentication. Since user accounts get a 4MB upload limit and anonymous accounts get a 1MB upload limit (and this obviously requires all images to stay as PNG), I've kind of forced the code to use an account if upload is enabled. If it is not the users account (there is an option to authenticate your own account), it'll just upload to a default account I set up, where the
pyimgur.Imgur client is un-pickled from a string. If you successfully authenticate, it'll use your account instead until you use imgur_auth = False, where it'll reset it.So, as to the actual code, if you input an image to write over, it'll calculate the minimum bits it can use to store the required data. The first red value will be used as a header, then the data will be split and replace the last few bits of each colour in the image. This goes up to 7, then after that at 8, there is no point in keeping the original resolution since the original image would exist no more, so it just reverts to as if you never input an image.
If you don't input an image, you can choose a ratio since the resolution can be anything, and it defaults to
'16:9'. If you input an image and it reverts to this method, it calculates the ratio from the resolution of the image.I mentioned a header before and I go into more detail
Solution
In short, your class is too big. Try splitting it into two or three classes,
Making
Line's
First off, I tried following that 72/80 line thing for the first time
It's really good that you done that.
For your strings say
You can use operator-less raw string concatenation.
Also if you don't like on delimiter lining you can use an alternate.
I think that looks nicer. And now you don't have to care about concatenating them.
As for your cuts/renames.
I personally would say that that good variable names are better than 81 characters in a line.
However normally you can do something to achieve both.
First, there is normally a problem if you are indenting more than say 5 times.
You could probably re-factor the code into two or more functions.
I prefer the third way, however some prefer the forth way.
So if you see the third way and not the forth way in my review then either is good.
As for very long comprehensions I think splitting them into multiple lines is very good.
And you do that for the most part.
The above is what I find.
I know that I use the former more than the latter,
but I find the latter easier to read, and if
then you could easily merge them together.
General
-
While I can see
It's like there is a number on a display past some glass.
You can see it, and you know what it is.
But when you try to use your pen to overwrite it you just get the writing on the glass.
Now you can't see the number on display,
as your new writing looks like it's on display.
And everyone behind you sees yours as if it were the writing on display.
In short, you're not behind any glass. And so you don't need
-
Try to limit the size of
You have two imports in a
Change this so there are two different
and do in if if you need to limit the second
-
I hate
It seems stupid, you can just put the letters in the dict.
This both stops you slaving away when changing the name of the dict.
And make it so that I know what the dict contains straight away.
-
Using on delimiter line calls impairs your programs readability.
It's easier to just read on the left side of the screen,
then to jump from left to right.
-
Your
However if you can't change your program to not use strings you would have to use
and that can get ugly.
Or you could use the dictionary instead.
-
URL's are normally very alike to Unix paths.
Rather than re-inventing the wheel, you can use
This will allow for possibly safer path changes as your current path methods aren't too safe.
-
I think mixing opening via URL and system is a very bad Idea.
I would rather as a developer have the option for one or the other,
I don't want to download some file, when I specified a file on the system.
-
Your
To be honest I kinda don't want to read something that large.
It's ~300 lines long, and has indents up to and including 6.
It should really be split up into more functions no matter how you look at it.
-
I think you really need to remove the Imgur implementation in this program.
All it does is add complexity to your program.
First I think that your class should have an 'encode' - put data in the image,
and a 'decode' - take information out of it.
Then you can pass a PIL image to it, and the functions do there magic.
No confusion over how to handle Imgur, no confusion over reading or writing and no compression.
Your aim should not be to throw as many features into this as possible.
(Just to note I'm not saying don't write this with Imgur support, that's a really cool idea.)
I just think Imgur should be designed afterwards.
And defiantly not implemented in the same class, let alone the same f
Steganography, Imgur and URLOpen.Making
Steganography it's own class is a must for this program.Line's
First off, I tried following that 72/80 line thing for the first time
It's really good that you done that.
For your strings say
encoded_client.You can use operator-less raw string concatenation.
Also if you don't like on delimiter lining you can use an alternate.
encoded_client = (
'eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL'
'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr'
...
'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4='
)I think that looks nicer. And now you don't have to care about concatenating them.
As for your cuts/renames.
I personally would say that that good variable names are better than 81 characters in a line.
However normally you can do something to achieve both.
First, there is normally a problem if you are indenting more than say 5 times.
You could probably re-factor the code into two or more functions.
# This is kinda hard to read.
# But is allowed.
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
# This is just wrong.
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
# This is one way to solve the problem
decoded_data = self.decode(
''.join(chr(number) for number in truncated_data)
)
# And an alternate
decoded_data = self.decode(
''.join(chr(number) for number in truncated_data))I prefer the third way, however some prefer the forth way.
So if you see the third way and not the forth way in my review then either is good.
As for very long comprehensions I think splitting them into multiple lines is very good.
And you do that for the most part.
# Harder to read.
(chunk for chunk in my_list if chunk)
# Easier to read
(
chunk
for chunk in my_list
if chunk
)The above is what I find.
I know that I use the former more than the latter,
but I find the latter easier to read, and if
my_list was a generator,then you could easily merge them together.
General
-
global is used to change things in global scope when in a lower scope.While I can see
override_upload in a function, I can't change it.It's like there is a number on a display past some glass.
You can see it, and you know what it is.
But when you try to use your pen to overwrite it you just get the writing on the glass.
Now you can't see the number on display,
as your new writing looks like it's on display.
And everyone behind you sees yours as if it were the writing on display.
In short, you're not behind any glass. And so you don't need
global.-
Try to limit the size of
trys.You have two imports in a
try.Change this so there are two different
trys,and do in if if you need to limit the second
try.-
I hate
my_dict = {};my_dict['a'] = 'a'.It seems stupid, you can just put the letters in the dict.
This both stops you slaving away when changing the name of the dict.
And make it so that I know what the dict contains straight away.
my_dict = {'a': 'a'}.-
Using on delimiter line calls impairs your programs readability.
It's easier to just read on the left side of the screen,
then to jump from left to right.
-
Your
ISGlobals.get function can be removed if you are ok with changing the classes __dict__.self.__dict__.update(global_dict) is how you would do this.However if you can't change your program to not use strings you would have to use
getattr,and that can get ugly.
Or you could use the dictionary instead.
-
URL's are normally very alike to Unix paths.
Rather than re-inventing the wheel, you can use
posixpath.This will allow for possibly safer path changes as your current path methods aren't too safe.
-
I think mixing opening via URL and system is a very bad Idea.
I would rather as a developer have the option for one or the other,
I don't want to download some file, when I specified a file on the system.
-
Your
write is way too long.To be honest I kinda don't want to read something that large.
It's ~300 lines long, and has indents up to and including 6.
It should really be split up into more functions no matter how you look at it.
-
I think you really need to remove the Imgur implementation in this program.
All it does is add complexity to your program.
First I think that your class should have an 'encode' - put data in the image,
and a 'decode' - take information out of it.
Then you can pass a PIL image to it, and the functions do there magic.
No confusion over how to handle Imgur, no confusion over reading or writing and no compression.
Your aim should not be to throw as many features into this as possible.
(Just to note I'm not saying don't write this with Imgur support, that's a really cool idea.)
I just think Imgur should be designed afterwards.
And defiantly not implemented in the same class, let alone the same f
Code Snippets
encoded_client = (
'eJxtkEtPwzAQhO/+I+0JZf1YO0ckQIpUuL'
'T0GvmxaS2aEMUpUv89diooIC6RNd9kRrPr'
...
'R5MrigKutqcMgJbfCCZ7dhyd19ArUPmi4='
)# This is kinda hard to read.
# But is allowed.
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
# This is just wrong.
decoded_data = self.decode(''.join(chr(number)
for number in truncated_data))
# This is one way to solve the problem
decoded_data = self.decode(
''.join(chr(number) for number in truncated_data)
)
# And an alternate
decoded_data = self.decode(
''.join(chr(number) for number in truncated_data))# Harder to read.
(chunk for chunk in my_list if chunk)
# Easier to read
(
chunk
for chunk in my_list
if chunk
)file_path = '...'
# The option for me to NOT use `load_url`.
image = (
load_url(file_path)
if is_url(file_path) else
load(file_path)
)
image = encode('My message', image, bits=4)
save_image(image)# Byte = 8 bits
# Chunk = `bits` bits
# Converts from a byte to bits then to chunks
def from_bytes(message, bits):
bit = 0
total = 0
for byte in message:
for _ in range(8):
bit += 1
total <<= 1
total += (byte & 128) >> 7
if bit == bits:
yield total
bit = 0
total = 0
byte <<= 1
while True:
yield 0
# Add image and message data.
def encode(message, image, bits):
NOT_BITS = ~((1 << bits) - 1)
return (
(byte & NOT_BITS) + data
for byte, data in zip(image, from_bytes(message, bits))
#for byte, data in itertools.izip(image, from_bytes(message, bits)) # Python2
)
# Modified decode. Non-8bits one.
# I removed the header for simplicity
def decode(image, bits):
image_data = ''.join(
str(bin(i))[2:].zfill(8)[8 - bits:]
for i in image
)
return [
int(j, 2)
for j in (
image_data[i:i + 8]
for i in range(0, len(image_data), 8)
)
]Context
StackExchange Code Review Q#106262, answer score: 9
Revisions (0)
No revisions yet.