Recent Entries 10
- pattern minor 112d ago
  Python breadth first search algorithm

  Any pointers or advice on how to clean up my code and make it more Pythonic would be appreciated.

  ```
  graph = {'A': ['B','Y','D', 'E',],
           'B': ['F','F'],
           'C': ['F'],
           'D': [],
           'E': ['F','G'],
           'F': ['F','A'],
           'G': ['E','K'],
           'K': ['M','L']
           }

  def bfs(graph,start,end):
      vertex = [start]
      history = []
      # create new node to be iterated through
      # update history
      # update vertex
      while vertex: # while vertex is not empty. len(li) == 0:
          node = vertex[0] # get the 0th element of the vertex
          history.append(node)
          print(node)
          vertex.pop(0) # pop the 0th element of the vertex
          if node == end:
              return end
          # iterate through the graph. gather all of the values per node
          for i in graph.get(node, '/'): # if key does not exist, print '/'
              if i: #if i is not empty
                  vertex.append(i) # append the dict values into vertex list
                  print('vertex',vertex)
                  if i in history:
                      vertex.pop(-1)
  ```
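One possible cleanup of the breadth-first search above, offered as a sketch rather than the definitive answer: it uses `collections.deque` so that removing from the front of the queue is cheap, and a `visited` set so that no node is queued twice. Returning the visit order instead of printing it is an assumption about what the caller wants; the graph is the one from the question.

```python
from collections import deque


def bfs(graph, start, end):
    """Return nodes in the order BFS visits them, stopping once `end` is reached."""
    visited = {start}          # nodes that have already been queued
    queue = deque([start])     # FIFO queue of nodes still to visit
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        if node == end:
            break
        # Nodes without an entry in the graph simply have no neighbours.
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


graph = {'A': ['B', 'Y', 'D', 'E'],
         'B': ['F', 'F'],
         'C': ['F'],
         'D': [],
         'E': ['F', 'G'],
         'F': ['F', 'A'],
         'G': ['E', 'K'],
         'K': ['M', 'L']}

print(bfs(graph, 'A', 'K'))
```

Marking a node as visited when it is queued, rather than when it is dequeued, avoids the append-then-pop step in the original loop.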
- pattern minor 112d ago
  Python CGI front-end for web service to perform machine translation

  I am trying to optimize this Python script, which processes web requests for machine translation. The actual translation executable that is called is quite fast, and so are the Perl scripts that are called. The largest performance boost came from removing unnecessary imports. I would like to have this code reviewed so I can further optimize the performance. I also welcome any advice on a Pythonic way of testing performance. My code is littered with timing and print commands that I removed for this post.

  ```
  #!/usr/bin/env python
  # -*- coding: UTF-8 -*-
  import time
  import sys
  import cgi
  import subprocess
  import string
  import xmlrpclib

  reload(sys)
  sys.setdefaultencoding('utf8')

  isTestPerformance = len(sys.argv) == 4

  # Parameters
  if isTestPerformance:
      source = sys.argv[1]
      target = sys.argv[2]
      sourceText = sys.argv[3]
  else:
      # this part is important to tell the browser that output is html text.
      print "Access-Control-Allow-Origin: *"
      print "Content-Type: text/plain;charset=utf-8"
      print
      form = cgi.FieldStorage()
      sourceText = form.getvalue("sourceText").decode('utf8')
      source = form.getvalue("source").lower()
      target = form.getvalue("target").lower()
      # Decode the CGI encoded source text
      # NOTE: Custom encoding of semicolon (;), (?), (&), (#), etc, is only done here because
      # CGI can not handle them. Do not use this (decode) if you are not using CGI,
      # or use some other decoding that matches the encoding from the caller of this code
      sourceText = sourceText.replace("__QUESTION_MARK__", "?")
      sourceText = sourceText.replace("__SEMICOLON__", ";")
      sourceText = sourceText.replace("__AMPERSAND__", "&")
      sourceText = sourceText.replace("__NUMBER__", "#")
      # sourceText = sourceText.replace("__NEWLINE__", "\n")

  # Tokenize the Source Text
  if source == "zh":
      # Chinese has to do word alignment
      # options are slim: write the text to a file
      # use NLTK Stanford NLP (python>java) to segment chinese
  ```
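On the request for a Pythonic way of testing performance: one option, sketched here under the assumption that Python 3 is available (the script above is Python 2), is a small context manager that records wall-clock time per stage, so the timing code no longer has to be scattered through the script. `timed` and `decode_placeholders` are hypothetical names introduced for this illustration; the placeholder table mirrors the `replace` calls in the question.

```python
import time
from contextlib import contextmanager

# Same substitutions as the chain of replace() calls in the script above.
PLACEHOLDERS = {
    "__QUESTION_MARK__": "?",
    "__SEMICOLON__": ";",
    "__AMPERSAND__": "&",
    "__NUMBER__": "#",
}


def decode_placeholders(text):
    """Hypothetical helper: undo the custom CGI-safe encoding of the source text."""
    for placeholder, char in PLACEHOLDERS.items():
        text = text.replace(placeholder, char)
    return text


@contextmanager
def timed(label, timings):
    """Store the wall-clock duration of the enclosed block in timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


timings = {}
with timed("decode", timings):
    decoded = decode_placeholders("a__QUESTION_MARK__b__SEMICOLON__c")

print(decoded)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f}s")
```

Wrapping each stage (decoding, tokenizing, the translation call) in its own `with timed(...)` block would show where the time goes without editing the stages themselves.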
- principle minor 112d ago
  Doodlebug vs. ant population simulation

  I am looking for a review on one of my homework assignments for this semester. This homework has already been submitted and graded and my final has already been submitted so there is no cheating or conflict of interest in my review request!

  I would love some advice on how to better manage my class interactions and how to better encapsulate data. The specific area I struggle with is when using parent classes and giving access to only protected assets from child classes. My professor has mentioned on several occasions that it would be much better to keep data private and give access through functions, but how would I initialize those member variables when instantiating an instance of a child class that has private members in the parent class?

  I understand the idea of encapsulation is to protect data that shouldn't be manipulated by outside programmers or irrelevant classes. I was the sole developer on this project so I understand in this specific example encapsulation may not be paramount, but on a larger project with multiple engineers it would certainly be relevant.

  Includes

  ```
  #include "stdafx.h"
  #include
  #include
  #include
  using namespace std;
  ```

  Note: the coordinates struct just contains an integer xCoordinate and an integer yCoordinate.

  main

  ```
  int main()
  {
      //Create environment object containing
      environment antDoodlebugSimulation;
      antDoodlebugSimulation.InitializeSimulation();
      return 0;
  }
  ```

  Environment

  ```
  class environment
  {
      //friends of environment
      friend class organism;
      friend class doodlebug;
      friend class ant;

  private:
      organism * environmentBoard[20][20];
      void CreateStartPopulation();
      int GenerateRandomStartingLocations(int min, int max);
      void OutputCurrentEnvironment();
      void DoodlebugsAct();
      void AntsAct();
      void ResetCritterTimeStep();

  public:
      //constructor
      environment();
      //deconstructor
      ~environment();
      //public member functions
      void Initializ
  ```
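On the encapsulation question above (how a child class initializes members that are private in the parent): the usual answer is that the child hands the values to the parent's constructor and afterwards goes through the parent's accessors. In C++ that is done in the child constructor's member initializer list. The sketch below shows the same shape in Python purely as an illustration; the class names echo the question, while `move_to`, `can_breed`, and the breeding interval of 3 are assumptions that do not appear in the original code.

```python
class Organism:
    def __init__(self, x, y):
        # Double-underscore names are mangled to _Organism__x etc., so
        # subclasses cannot reach them as self.__x.
        self.__x = x
        self.__y = y
        self.__time_steps = 0

    @property
    def position(self):
        return (self.__x, self.__y)

    @property
    def time_steps(self):
        return self.__time_steps

    def move_to(self, x, y):
        """The only way to change the position; also counts the time step."""
        self.__x, self.__y = x, y
        self.__time_steps += 1


class Ant(Organism):
    BREED_INTERVAL = 3  # assumed value, for illustration only

    def __init__(self, x, y):
        # The child never touches the private fields: it passes the values
        # up and lets the parent initialize its own state.
        super().__init__(x, y)

    def can_breed(self):
        return self.time_steps > 0 and self.time_steps % self.BREED_INTERVAL == 0


ant = Ant(2, 5)
for step in range(1, 4):
    ant.move_to(2, 5 + step)
print(ant.position, ant.can_breed())
```

With this shape the board class would no longer need to be a friend of every organism type, since everything it needs is available through the public accessors.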
- snippet minor 112d ago
  Read lines from a file in chunks

  I created a function to read lines from a file in chunks. My motivation for writing this script was the way Python's `yield` works together with chunks. The script works fine, but now I would like to know whether anyone has improvements.

  ```
  #!/usr/bin/env perl
  use strict;
  use warnings;
  use Data::Dumper;

  sub read_in_chunks {
      my $args = shift;
      my $self = {
          fd         => $args->{fd} || undef,
          chunk_size => $args->{chunk_size} || 10,
          chunks     => [],
      };
      my $fh = $self->{fd};
      return unless defined(my $line=<$fh>);
      while(<$fh>){
          chomp($_);
          # maybe the following line could be written nicer :)
          ($self->{chunk_size} == 0) ? return $self->{chunks} : (push @{$self->{chunks}}, $_);
          $self->{chunk_size}--;
      }
      return $self->{chunks};
  }

  open my $fh, 'dump.txt' or die $!;
  my $opts = { fd => $fh, chunk_size => 4 };
  while(my $chunk = read_in_chunks($opts)) {
      print Dumper($chunk);
      # process data
  }
  close $fh;
  ```
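Since the question mentions Python's `yield` as the inspiration, here is a minimal sketch of what the Python counterpart might look like. It uses `itertools.islice` to pull up to `chunk_size` lines at a time from any iterable of lines; the in-memory `io.StringIO` sample stands in for `dump.txt` and is an assumption made so the example runs on its own.

```python
import io
from itertools import islice


def read_in_chunks(lines, chunk_size=10):
    """Yield lists of up to `chunk_size` lines (newline stripped) until input ends."""
    iterator = iter(lines)
    while True:
        chunk = [line.rstrip("\n") for line in islice(iterator, chunk_size)]
        if not chunk:
            return
        yield chunk


sample = io.StringIO("a\nb\nc\nd\ne\n")  # stand-in for an open file handle
chunks = list(read_in_chunks(sample, chunk_size=2))
print(chunks)
```

Every line ends up in exactly one chunk, and the final chunk is simply shorter when the line count is not a multiple of `chunk_size`.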
- snippet minor 112d ago
  Capture worksheet formulas in VBA format

  I had been searching for a simple way to capture worksheet formulas in VBA format. I came up with a solution below, which I wanted to share. With any luck this could be useful to someone down the road. Suggestions for improvement are welcome.

  ```
  Public Const vbQuadrupleQuote As String = """""" 'represents 2 double quotes for use in VBA R1C1 formulas ("")
  Public Const vbDoubleQuote As String = """" 'represents 1 double quote (")
  Public Const vbSingleQuote As String = "'" 'represents 1 single quote (')

  Sub CaptureFormulas()
  'simplifies the capturing of worksheet formulas in VBA format
  'Peter Domanico, May 2017
  'Steps:
  '(1) place this script in your personal macro workbook
  '(2) open Immediate Window in VBA (Control + G)
  '(3) run this script and follow prompts
  '(4) a With statement containing formulas for your selection will be printed to the Immediate Window
  '(5) you can use this With statement in any script

  Dim ws As String
  Dim rng As Range
  Dim MyString As String
  Dim MyColumn As Variant
  Dim MyRow As Variant
  Dim LastRow As String
  Dim MyRange As String
  Dim MyFormula As String

  'set worksheet string
  ws = "Activesheet" 'change this as needed

  'error handling
  On Error GoTo OuttaHere

  'select range
  Set rng = Application.InputBox("Select range to capture", ": )", Type:=8)

  'determine formula type
  MyQuestion = MsgBox(Prompt:="Fill formulas to last row?", _
      Buttons:=vbYesNo, Title:="???")

  Debug.Print "Dim ws as Worksheet" 'change this as needed
  Debug.Print "Set ws = Activesheet" 'change this as needed
  Debug.Print "LastRow = ws.Cells(Rows.Count,1).End(xlUp).Row" 'change this as needed
  Debug.Print "With ws" 'change this as needed

  For Each rng In rng
      MyColumn = rng.Column
      CurrentRow = rng.Row
      Select Case MyQuestion
          Case vbYes
              LastRow = "LastRow"
          Case vbNo
              LastRow = CurrentRow
      End Select
      MyRange = ".Range(.Cells(" & CurrentRow & "," & MyColumn & "),.Cells(" & LastRow & "," & MyColumn & "))="
      MyF
  ```
- pattern minor 112d ago
  Find factors of a number

  I am a beginner in C. I have just written this program to find the factors of a provided number \$n\$ where \$1\leq n \leq 10^9\$. However, when I input large numbers (e.g. the maximum, \$10^9\$), the program takes a long time to finish finding the larger factors. How can I reduce its running time? Also, are there any possible improvements for this code?

  ```
  #include <stdio.h>

  int main(){
      int a,i;
      scanf("%d",&a);
      for(i=1;i<(a/2+1);i++){
          if(a%i==0){
              printf("%d\n",i);
          }
      }
      printf("%d\n",a);
      return 0;
  }
  ```
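The standard way to speed up the factor search above is to stop at \$\sqrt{n}\$: every divisor `i` below the square root pairs with the divisor `n / i` above it, so for \$n = 10^9\$ the loop needs about 31,623 iterations instead of 500 million. The sketch below shows the idea in Python rather than C; `factors` is a name chosen for this illustration, and `math.isqrt` requires Python 3.8 or later.

```python
import math


def factors(n):
    """Return all positive divisors of n in ascending order."""
    small, large = [], []
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            small.append(i)
            partner = n // i
            if partner != i:        # do not list a perfect square's root twice
                large.append(partner)
    # `large` was collected in descending order, so reverse it before joining.
    return small + large[::-1]


print(factors(36))
print(len(factors(10**9)))
```

The same loop bound carries over to the C version: iterate while `i * i <= a` (using a type wide enough that the product cannot overflow) and print both `i` and `a / i`.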
- pattern minor 112d ago
  Unit-testing event aggregator

  I have written an event aggregator with the following API (just for the fun of it; I am aware that NuGet has like 100 similar implementations):

  ```
  /// <summary>
  /// Events aggregator
  /// </summary>
  public interface IMessenger
  {
      /// <summary>
      /// Subscribes an object to all relevant events.
      /// </summary>
      /// <param name="listener">Object, that implements one or more IListener interfaces</param>
      /// <returns>Subscription handle. Dispose to unsubscribe.</returns>
      IDisposable Subscribe(object listener);

      /// <summary>
      /// Subscribes a delegate to TMessage event.
      /// </summary>
      /// <param name="handler">Event handler.</param>
      /// <returns>Subscription handle. Dispose to unsubscribe.</returns>
      IDisposable Subscribe<TMessage>(Action<TMessage> handler) where TMessage : IMessage;

      /// <summary>
      /// Sends message to event pipeline.
      /// </summary>
      /// <param name="message">Message to send.</param>
      /// <returns>Awaitable task, that returns true if message was processed successfully. Otherwise - false.</returns>
      Task<bool> PublishAsync(IMessage message);
  }

  /// <summary>
  /// Message that can be delivered to subscribers via IMessenger.
  /// </summary>
  public interface IMessage
  {
      /// <summary>
      /// Whether or not message was handled by subscriber.
      /// </summary>
      bool Handled { get; set; }
  }

  /// <summary>
  /// Implementations of this interface are recognized by IMessenger as event handlers for TMessage.
  /// </summary>
  /// <typeparam name="TMessage">Type of message.</typeparam>
  public interface IListener<TMessage> where TMessage : IMessage
  {
      /// <summary>
      /// This method is called by IMessenger when TMessage is published.
      /// </summary>
      /// <param name="message">Message received by IMessenger.</param>
      void Handle(TMessage message);
  }
  ```

  And here are some unit tests I have written in order to test my implementation:

  ```
  [TestFixture]
  class MessengerTests
  {
      [Test]
      public void Subscribe_OnNullListener_Throws()
      {
          var messenger = new Messenger();
          IListener listener = null;
          Assert.Throws(() => messenger.Subscribe(listener));
      }

      [Test]
      public void Subscribe_OnNotListener_Throws()
      {
          var messenger = new Messenger();
          var listener = new object();
          Assert.Throws(() => messenger.Subscribe(listener));
  ```
- pattern minor 112d ago
  Preprocessing steps to follow while cleaning and extracting text data from tweets

  I have a dataset of around 200,000 tweets and I am running a classification task on them. The dataset has two columns: the class label and the tweet text. In the preprocessing step I pass the dataset through the following cleaning step:

  ```
  import re
  from datetime import datetime
  from nltk.corpus import stopwords
  import pandas as pd

  def preprocess(raw_text):
      # keep only words
      letters_only_text = re.sub("[^a-zA-Z]", " ", raw_text)

      # convert to lower case and split
      words = letters_only_text.lower().split()

      # remove stopwords
      stopword_set = set(stopwords.words("english"))
      meaningful_words = [w for w in words if w not in stopword_set]

      # join the cleaned words in a list
      cleaned_word_list = " ".join(meaningful_words)

      return cleaned_word_list

  def process_data(dataset):
      tweets_df = pd.read_csv(dataset,delimiter='|',header=None)

      num_tweets = tweets_df.shape[0]
      print("Total tweets: " + str(num_tweets))

      cleaned_tweets = []
      print("Beginning processing of tweets at: " + str(datetime.now()))

      for i in range(num_tweets):
          cleaned_tweet = preprocess(tweets_df.iloc[i][1])
          cleaned_tweets.append(cleaned_tweet)
          if(i % 10000 == 0):
              print(str(i) + " tweets processed")

      print("Finished processing of tweets at: " + str(datetime.now()))
      return cleaned_tweets

  cleaned_data = process_data("tweets.csv")
  ```

  And here is the relevant output:

  ```
  Total tweets: 216041
  Beginning processing of tweets at: 2017-05-16 13:45:47.183113
  Finished processing of tweets at: 2017-05-16 13:47:01.436338
  ```

  It takes a little over a minute (about 74 seconds) to process the tweets. Although that is a relatively short time for the current dataset, I would like to improve it further, especially for when I use a much bigger dataset. Can the steps/code in the `preprocess(raw_text)` method be improved in order to achieve faster execution?
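One likely hotspot in the code above is that `preprocess` rebuilds the stopword set on every call, so the work is repeated for each of the 216,041 tweets. A sketch of the usual remedy follows: build the set and compile the regular expression once at module level. To keep the example runnable without NLTK, `STOPWORDS` here is a tiny stand-in and not the real `stopwords.words("english")` list; `process_tweets` is a hypothetical name, and no timing is claimed for it.

```python
import re

# Stand-in for set(stopwords.words("english")); built once, not per tweet.
STOPWORDS = frozenset({"a", "an", "the", "is", "to", "and", "of", "in"})

# Compiled once; runs of non-letters collapse to a single space.
NON_LETTERS = re.compile(r"[^a-zA-Z]+")


def preprocess(raw_text):
    """Keep letters only, lower-case, and drop stopwords."""
    words = NON_LETTERS.sub(" ", raw_text).lower().split()
    return " ".join(w for w in words if w not in STOPWORDS)


def process_tweets(tweets):
    """Clean every tweet in an iterable of strings."""
    return [preprocess(tweet) for tweet in tweets]


cleaned = process_tweets(["The cat is in the hat!!", "Going to NYC @ 5pm #fun"])
print(cleaned)
```

With pandas, the same function could be applied to the text column as a whole (for example with `Series.apply`) instead of indexing row by row with `iloc`, which is typically slow.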
- pattern minor 112d ago
  Extracting the IP addresses of Docker containers using JSON API

  The following piece of code is imported in another file to get the status of the Docker containers. I print the required information in tabular form. I use `next()` to find the only key available in the dictionary; this key changes from container to container, which is why I look it up with `next()`. However, `next()` raises an exception when the iterator is exhausted, and currently I handle that with `pass`. My question is: is there a better way to handle the `StopIteration` exception raised by `next()`?

  ```
  import requests
  from prettytable import PrettyTable

  def container_status(status):
      url = None
      if status == "all":
          url = "http://127.0.0.1:6000/containers/json?all=1"
      elif status == "running" :
          url = "http://127.0.0.1:6000/containers/json?all"
      else:
          raise ValueError("status should be either 'all' or 'running'")
      return requests.get(url)

  def active_containers(status):
      response = container_status(status)
      table = PrettyTable(["Container Name", "Container ID", "Status", "IP ADDR"])
      for i in response.json():
          try:
              table.add_row([i["Names"][0].encode('utf-8').replace('/', ''),
                             i['Id'].encode('utf-8')[:12],
                             i["State"],
                             i["NetworkSettings"]["Networks"][next(iter(i["NetworkSettings"]["Networks"]))]["IPAddress"]])
          except StopIteration:
              pass
      print(table)
  ```

  What other improvements can I make to the code?
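For the `StopIteration` question above: the built-in `next()` accepts a second argument that is returned when the iterator is exhausted, so no `try`/`except` is needed. The sketch below isolates that lookup in a helper; `first_ip_address` is a hypothetical name, and the sample dictionaries only imitate the fields the original code reads, they are not real API output.

```python
def first_ip_address(container):
    """Return the IP address of the container's first network, or None."""
    networks = container.get("NetworkSettings", {}).get("Networks", {})
    # next() with a default never raises StopIteration.
    first_network = next(iter(networks.values()), None)
    if first_network is None:
        return None
    return first_network.get("IPAddress")


with_network = {
    "NetworkSettings": {"Networks": {"bridge": {"IPAddress": "172.17.0.2"}}}
}
without_network = {"NetworkSettings": {"Networks": {}}}

print(first_ip_address(with_network))
print(first_ip_address(without_network))
```

A container without networks then still gets a table row, with an empty address, instead of being skipped silently.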
- pattern minor 112d ago
  "The Genuine Sieve of Eratosthenes" in C++14

  The following C++ code implements the "Genuine Sieve of Eratosthenes" algorithm as described in Melissa O'Neill's classic paper. On my MacBook it computes the first 1,000,000 primes in about 11 seconds.

  ```
  $ ./a.out | head -1000000 | tail -1
  15485863
  $
  ```

  Just looking for general code review comments here. At least two parts of `sieverator` smell really bad to me, but I'm not sure of the proper way to fix them while still preserving the general "STLishness" of this code. For example, I really want to keep using the ranged for-loop in `main`.

  ```
  #include
  #include
  #include
  #include
  #include
  #include

  template<class Int>
  class iotarator {
      Int value = 0;
      Int step = 1;
  public:
      explicit iotarator() = default;
      explicit iotarator(Int v) : value(v) {}
      explicit iotarator(Int v, Int s) : value(v), step(s) {}
      Int operator*() const { return value; }
      iotarator& operator++() { value += step; return *this; }
      iotarator operator++(int) { auto ret = *this; ++*this; return ret; }
      bool operator==(const iotarator& rhs) const {
          return value == rhs.value && step == rhs.step;
      }
      bool operator!=(const iotarator& rhs) const { return !(*this == rhs); }
  };

  template<class Int>
  class sieverator {
      struct erased_iterator {
          virtual Int dereference() = 0;
          virtual void increment() = 0;
      };
      template<class It>
      class derived_iterator : public erased_iterator {
          It it;
      public:
          derived_iterator(It it) : it(std::move(it)) {}
          Int dereference() override { return *it; }
          void increment() override { ++it; }
      };

      Int m_current;
      std::unique_ptr<erased_iterator> m_ptr;

      explicit sieverator() {}  // used by .end()
  public:
      template<class It>
      explicit sieverator(It it) :
          m_current(*it),
          m_ptr(std::make_unique<derived_iterator<It>>(std::move(it))) {}

      sieverator begin() { return std::move(*this); }
      sieverator end() const { return sieverator{}; }
      bool operator==(const sieverator&) const { return false; }
      bool op
  ```
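For readers unfamiliar with the incremental sieve the question above refers to, the core idea can be sketched compactly in Python. This is a dictionary-based variant and not a translation of the C++ code: each known prime is stored against the next composite it will cross off, so a number is prime exactly when no entry is waiting for it. The names `primes` and `composites` are chosen for this illustration.

```python
from itertools import count, islice


def primes():
    """Yield primes indefinitely using an incremental sieve."""
    composites = {}  # upcoming composite -> primes that divide it
    for n in count(2):
        if n not in composites:
            # No smaller prime reaches n, so n is prime.
            # Its first composite that matters is n * n.
            yield n
            composites[n * n] = [n]
        else:
            # n is composite: move each of its primes on to its next multiple.
            for p in composites.pop(n):
                composites.setdefault(n + p, []).append(p)


first_ten = list(islice(primes(), 10))
print(first_ten)
```

Because the generator is lazy, it can be consumed with a plain `for` loop and `break`, which is the Python analogue of the ranged for-loop the question wants to keep in `main`.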