patternphplaravelMinor
Document Merger using PhpDocX
Viewed 0 times
mergerusingdocumentphpdocx
Problem
I'm developing a document merger that utilizes an FTP site containing hundreds of documents.
FTP Connection Function
I have a function that connects to the FTP site and generates an array of the documents stored inside a designated directory. It will remove the extensions from any files that are of type
Form Generation Function
My view is very basic right now just to prove functionality. There is lots of design work that needs done to make it more user-friendly and functional.
Blade View
Lastly, I have a function that will accept an array of string values from the GUI. It will then establish a FTP connection and grab all documents requested from the FTP site. It will store these in a temp directory until the documents are merged and downloaded to the browser. Then it will unlink all temp files.
Merge and Download Documents Function
```
public static function mergeDocument() {
$ftp_conn = self::establishFT
FTP Connection Function
private static function establishFTP() {
$ftp_conn = ftp_connect(getenv('FTP_HOST'));
ftp_login($ftp_conn,getenv('FTP_USER'),getenv('FTP_PASS'));
ftp_pasv($ftp_conn,true);
return $ftp_conn;
}I have a function that connects to the FTP site and generates an array of the documents stored inside a designated directory. It will remove the extensions from any files that are of type
.docx. Then it will filter out all values containing a .. This removes any extra files that are not a Word document and also removes the FTP root pathings of . and .. from the array.Form Generation Function
public static function generateForm() {
$ftp_conn = self::establishFTP();
$forms = ftp_nlist($ftp_conn,getenv('FTP_DIRECTORY'));
$forms = str_replace('.docx','',$forms);
$forms = array_filter($forms,function($value) {
return strpos($value,'.') === false;
});
ftp_close($ftp_conn);
return view('testDoc')->with(array('forms'=>$forms));
}My view is very basic right now just to prove functionality. There is lots of design work that needs done to make it more user-friendly and functional.
Blade View
Form
{{ Form::open(array('action'=>'DocumentController@mergeDocument')) }}
@foreach($forms as $doc)
{{ Form::checkbox('documents[]',$doc) }}
{{ Form::label(null,$doc) }}
@endforeach
{{ Form::submit('Submit',array('id'=>'submitBtn')) }}
{{ Form::close() }}
Lastly, I have a function that will accept an array of string values from the GUI. It will then establish a FTP connection and grab all documents requested from the FTP site. It will store these in a temp directory until the documents are merged and downloaded to the browser. Then it will unlink all temp files.
Merge and Download Documents Function
```
public static function mergeDocument() {
$ftp_conn = self::establishFT
Solution
Security: Credentials in ENV
It is less than ideal to store your credentials in environment variables, as they may easily leak.
Of course, you don't want to allow access to phpinfo to just anyone either, but it may still leak, and not everyone who should be allowed to see phpinfo should be allowed to see the FTP credentials.
Security: Directory Traversal which may lead to Information Leak and Limited DOS
An attacker can delete arbitrary .docx files in arbitrary locations because you never check the input for directory traversal. The problematic code is this:
The impact of this is rather small, but depending on what you store on your server, there may be some damage from this.
As you actually know what files are acceptable, you can easily use a whitelist approach here. At a minimum, you should check if the normalized path is inside the root dir.
Alternatively, if you do decide to cache filenames locally (see below), you could simply pass ids to the user instead of filenames, avoiding the problem altogether.
The same problem also exists here:
An attacker could read .doxc files outside of the
Security: SSL
Your FTP connection will not be encrypted, meaning that a theoretical man in the middle can intercept your FTP credentials or the transfered files.
This may or may not be a problem for your use-case.
An alternative would be ftp_ssl_connect or using sftp (see also here).
Security/Funktionality: DOS & Performance
From what I can tell, each time I visit the website containing the form that lists files I can join, you open an FTP connection and retrieve a list of all the files (which may be quite a lot).
A user (or multiple user) may accidentally or on purpose reload the form a bunch of times, causing performance problems.
I would definitely profile this and see what strain it puts on your server. It may make sense to cache the list of docs locally (if it doesn't change that often).
Usability / Bug / Security: File Listing
Your mechanism will hide all files which contain a dot in their filename, which may lead to hard to trace bugs.
Your current mechanism may also have security implications, as files with no extension (for example
Instead, I would probably use an approach such as this:
Naming
You use
It is less than ideal to store your credentials in environment variables, as they may easily leak.
phpinfo for example will print all environment variables. Of course, you don't want to allow access to phpinfo to just anyone either, but it may still leak, and not everyone who should be allowed to see phpinfo should be allowed to see the FTP credentials.
Security: Directory Traversal which may lead to Information Leak and Limited DOS
An attacker can delete arbitrary .docx files in arbitrary locations because you never check the input for directory traversal. The problematic code is this:
$forms = Input::get('documents');
[...]
foreach($forms as $doc) {
unlink("temp/files/$doc.docx");
}The impact of this is rather small, but depending on what you store on your server, there may be some damage from this.
As you actually know what files are acceptable, you can easily use a whitelist approach here. At a minimum, you should check if the normalized path is inside the root dir.
Alternatively, if you do decide to cache filenames locally (see below), you could simply pass ids to the user instead of filenames, avoiding the problem altogether.
The same problem also exists here:
$forms = Input::get('documents');
[...]
foreach($forms as $doc) {
ftp_get($ftp_conn,"temp/files/$doc.docx",getenv('FTP_DIRECTORY') . "/$doc.docx",FTP_BINARY);
$mergeDocs[] = "temp/files/$doc.docx";
}An attacker could read .doxc files outside of the
FTP_DIRECTORY (if the FTP server allows this, and if the local server is writable at the given directory, both rather big ifs, but I would still protect against this).Security: SSL
Your FTP connection will not be encrypted, meaning that a theoretical man in the middle can intercept your FTP credentials or the transfered files.
This may or may not be a problem for your use-case.
An alternative would be ftp_ssl_connect or using sftp (see also here).
Security/Funktionality: DOS & Performance
From what I can tell, each time I visit the website containing the form that lists files I can join, you open an FTP connection and retrieve a list of all the files (which may be quite a lot).
A user (or multiple user) may accidentally or on purpose reload the form a bunch of times, causing performance problems.
I would definitely profile this and see what strain it puts on your server. It may make sense to cache the list of docs locally (if it doesn't change that often).
Usability / Bug / Security: File Listing
Your mechanism will hide all files which contain a dot in their filename, which may lead to hard to trace bugs.
Your current mechanism may also have security implications, as files with no extension (for example
secret) are listed as well, even though they shouldn't be.Instead, I would probably use an approach such as this:
- Filter out all files that do not have the docx extension
- Remove the .docx extension
Naming
You use
forms in multiple places as variable name where it doesn't really make sense. The variable doesn't contain multiple forms, but actually multiple document names. This naming also leads to the odd to read foreach($forms as $doc) loop. Instead, you should name name these variables $docs (in mergeDocument, generateForm, and in the view).Code Snippets
$forms = Input::get('documents');
[...]
foreach($forms as $doc) {
unlink("temp/files/$doc.docx");
}$forms = Input::get('documents');
[...]
foreach($forms as $doc) {
ftp_get($ftp_conn,"temp/files/$doc.docx",getenv('FTP_DIRECTORY') . "/$doc.docx",FTP_BINARY);
$mergeDocs[] = "temp/files/$doc.docx";
}Context
StackExchange Code Review Q#138868, answer score: 5
Revisions (0)
No revisions yet.