HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMajor

Simple and effective port checker in C++

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
simpleportandeffectivechecker

Problem

Intro

A couple of weeks ago I finished a Python implementation of a multithreaded port checker and I wasn't quite happy with the result I've got (speed). I needed it to be faster, so I've built another one using CPP (and a bit of C).

Here is the Python code, if anyone wants to see it.

At the moment, I didn't implemented the threading module for my script because I'd like to get some advice on what would be the wisest choice (while asking for an answer which adds this part is forbidden here, I wouldn't mind seeing an example of it)

Description

The program takes as command line arguments the following:

  • a text file which contains one domain name per line



  • the port number for which the above domains will be checked



  • number of threads



  • the timeout



  • a text file where the domains which have the port open will be written (along with their IP address) e.g: google.com:172.217.22.46



What the program is currently doing is to translate each domain into an IP address, and check if the given port is opened or not. If it's opened, write the domain:ip to an output file.

The program can be compiled and run using the below commands:

g++ port_checker.cpp -o checker
./checker domains.txt 80 2 2 output.txt


Code

```
#include
#include
#include
#include
#include
#include
#include

std::vector get_domains_from_file(std::string domains_file) {
std::vector domains_array;
std::ifstream file(domains_file);
std::string domain;

if (!file.is_open()) {
std::cerr > domain) {
domains_array.push_back(domain);
}

return domains_array;
}

int main(int argc, char *argv[]) {
struct hostent *h;
struct sockaddr_in servaddr;

int sd, rval;

if (argc != 6) {
std::cerr domains = get_domains_from_file(domains_file);

std::ofstream myfile;
myfile.open(output_file);

for (int i = 0; i h_addr)) h_addr, h -> h_length);

rval = connect(sd, (struct sockaddr *) &servaddr, sizeof(servaddr));

Solution

Checking TCP ports and multi-threading

Your performance should be fine for a sequential program. Another way to check whether the TCP port is open is by just sending the initial SYN and wait for the ACK. That way, you don't have to do the complete TCP handshake. However, that also means that you have to implement that part of the TCP protocol yourself. Also, some platforms don't allow using raw IP sockets for that for restricted users, which can pose another problem.

However, you cannot implement multi-threading at the moment, at least not over your current for loop. gethostbyname is not thread-safe. You have to use another function there, otherwise you will end up with data races and therefore undefined behaviour. GNU provides gethostbyaddr_r, which is an extension of the POSIX standard.

That being said, let's have a look at your current code.

Proper includes

#include 
#include 
#include 
#include 
#include 
#include 


You're missing several includes. cout and cerr need `, memcpy and memset need , exit needs cstdlib.

Be more user friendly

We can use your
get_domains_from_file as a case-study:

std::vector get_domains_from_file(std::string domains_file) {
    std::vector domains_array;
    std::ifstream file(domains_file);
    std::string domain;

    if (!file.is_open()) {
        std::cerr > domain) {
        domains_array.push_back(domain);
    }

    return domains_array;
}


At the moment, you simply
exit(-1) whenever you encounter a problem. I see two problems there:

  • The exit code doesn't yield any information, apart from "it worked" or "it failed"



  • You cannot recover the error, for example if you want to keep running even if you failed to create a socket for the 6000th domain.



Consider alternative input and output streams

Also, since we're currently in UNIX land, your program isn't very pipe-friendly. One can only get the domains from a file, but in a real-world scenario, it's likely that you have some logs which you want to check, e.g.

grep "acme" /var/log/attacks | ./domain-checker 80 4 2


We can at least make your program pipe-ready, if we use an
istream here:

std::vector get_domains_from_stream(std::istream & in) {
    std::vector domains_array;
    std::string domain;

    while (in >> domain) {
        domains_array.push_back(domain);
    }

    return domains_array;
}


You can implement your other function with
get_domains_from_stream:

std::vector get_domains_from_file(std::string domains_file) {
    std::ifstream file(domains_file);

    if (!file.is_open()) {
        std::cerr << "Unable to open domains file!" << std::endl;
        return {};
    } else {
        return get_domains_from_stream(file);
    }
}


Note that I've returned an empty vector in this case. It is more or less a point of preference here, whether to return an empty resource, throw an exception, or do something else. But from a user's point of view, the "I've specified an empty file, whoops" and "I've specified a non-existing file, whoops" are the same kind of error: the user gave the wrong argument to your program.

Better usage message and input handling

While we're at error messages, you could improve your argument usage error message:

std::cerr << "The number of arguments should be 5 e.g: "
          << arg[0] << " domains.txt [port] [threads] [timeout] output.txt" 
          << std::endl;


You have no idea what the application is called on the user's PC, so just use
arg[0] for that.

Speaking of arguments, how does your program work with invalid input?

$ ./a.out domains.txt abc def ghi blarg.out


No error message, no exception. That's due to
atoi, which simply returns 0. Prefer std::stoi instead, which throws an exception if something is off:

struct hostent *h;             // hm, what are these?
struct sockaddr_in servaddr;   // where do you use them?

int sd, rval;                  // what are these ints?

const std::string domains_file = argv[1];
const std::string output_file  = argv[5];
const int port                 = std::stoi(argv[2]);
const int threads              = std::stoi(argv[3]);
const int timeout              = std::stoi(argv[4]);


By the way, my criticism about the input also holds for your output.
cout is usually fine for output. If a user wants to write to a file, they can just use redirection:

$ grep "acme.evil" | ./checker 80 4 2 > evil.ip


Limit the scope of your variables as much as possible

Still in the same section of your program, you define
int sd, int rval, struct hostent *h; and other's far too early. You always try to limit the scope of your variables. Therefore, get rid of them at that point. Also, use const a little bit more often to ensure that you don't change things. It will also help the compiler to optimize your program in several cases. Not necessarily in your current one, but it's usually a good idea.

The
for`-loop

Now, after above dis

Code Snippets

#include <fstream>
#include <vector>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <unistd.h>
std::vector<std::string> get_domains_from_file(std::string domains_file) {
    std::vector<std::string> domains_array;
    std::ifstream file(domains_file);
    std::string domain;

    if (!file.is_open()) {
        std::cerr << "Unable to open domains file!" << std::endl;
        std::exit(-1);
    }

    while (file >> domain) {
        domains_array.push_back(domain);
    }

    return domains_array;
}
grep "acme" /var/log/attacks | ./domain-checker 80 4 2
std::vector<std::string> get_domains_from_stream(std::istream & in) {
    std::vector<std::string> domains_array;
    std::string domain;

    while (in >> domain) {
        domains_array.push_back(domain);
    }

    return domains_array;
}
std::vector<std::string> get_domains_from_file(std::string domains_file) {
    std::ifstream file(domains_file);

    if (!file.is_open()) {
        std::cerr << "Unable to open domains file!" << std::endl;
        return {};
    } else {
        return get_domains_from_stream(file);
    }
}

Context

StackExchange Code Review Q#155755, answer score: 30

Revisions (0)

No revisions yet.