HiveBrain — Knowledge Dashboard

snippet minor by @import:stackexchange-codereview 112d ago

I created a function to read lines from a file in chunks. My underlying motivation for writing this script was the way Python's `yield` works together with chunks. The script works fine, but I would like to know whether anyone has improvements.

```
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

sub read_in_chunks {
    my $args = shift;
    my $self = {
        fd         => $args->{fd}         || undef,
        chunk_size => $args->{chunk_size} || 10,
        chunks     => [],
    };
    my $fh = $self->{fd};
    return unless defined(my $line = <$fh>);
    while (<$fh>) {
        chomp($_);
        # maybe the following line could be written nicer :)
        ($self->{chunk_size} == 0) ? return $self->{chunks} : (push @{$self->{chunks}}, $_);
        $self->{chunk_size}--;
    }
    return $self->{chunks};
}

open my $fh, 'dump.txt' or die $!;
my $opts = { fd => $fh, chunk_size => 4 };
while (my $chunk = read_in_chunks($opts)) {
    print Dumper($chunk);
    # process data
}
close $fh;
```
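For comparison, here is a minimal sketch of the Python `yield` idiom the post alludes to. It is not a translation of the Perl code above: the function name and the `chunk_size` default are borrowed from the post, and everything else is an assumption made for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_in_chunks(lines: Iterable[str], chunk_size: int = 10) -> Iterator[List[str]]:
    """Yield lists of up to chunk_size lines, with trailing newlines removed."""
    iterator = iter(lines)
    while True:
        # islice pulls at most chunk_size items and leaves the rest for the next round.
        chunk = [line.rstrip("\n") for line in islice(iterator, chunk_size)]
        if not chunk:
            return  # input exhausted, so the generator ends
        yield chunk


if __name__ == "__main__":
    # Any iterable of lines works; an open file handle would be passed the same way.
    for chunk in read_in_chunks(["a\n", "b\n", "c\n"], chunk_size=2):
        print(chunk)
```

Because the generator keeps its own position in the input, no line is read and then discarded between chunks, and the last chunk may simply be shorter than `chunk_size`.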

file iterator codereview perl generator

pattern minor by @import:stackexchange-codereview 112d ago

"Acro Words" - Creating Acronyms of a Text that are Words

I wrote a program that, given an input string, returns a list of tuples in which each element of the tuple is a word formed by dropping some number of characters from the original string while preserving order. For example: ``` 'valley' -> [('alley',), ('valley',)] 'friction' -> [('fiction',), ('friction',), ('ricin',)] 'university' -> [('unvest',), ('nesty',), ('unsty',), ...] 'freetrade' -> [..., ('erade',), ..., ('free', 'tade'), ..., ('free', 'trade'), ...('feer',), ...] ``` The algorithm I've written works as follows: - Find the powerset of the characters in the string. This gives us a set of lists of characters, conceptually given by "dropping" some number of characters of the input string, preserving order. `powerset('abc') = {(),('a',),('a', 'b'),('a', 'b', 'c'),('a', 'c'),('b',),('b', 'c'),('c',)}` - Next, for each element of the powerset, we perform a "powersplit" in which we find every possible split (or, inversely, concatenation). `powersplit(('a', 'b', 'c')) = {('a', 'b', 'c'), ('a', 'bc'), ('ab', 'c'), ('abc',)}` - Finally, for each tuple produced by the powersplit, we check if the contents of the tuple are in our dictionary. `contents_in_dictionary(dictionary, ('free', 'trade')) = True` The following is my code: ``` from functools import reduce from itertools import chain, combinations, islice, zip_longest def powerset(data): return {z for z in chain.from_iterable((x for x in combinations(data, r)) for r in range(len(data)+1))} def slices(data, indices): return [[x for x in islice(data, *list(y))] for y in zip_longest(chain([0], indices), chain(indices, [len(data)]))] def powersplit(data): return {tuple([''.join(x) for x in slices(data, indices)]) for indices in powerset(range(1, len(data)))} def get_dictionary(min_length): return {line.rstrip('\n').lower() for line in open('/usr/share/dict/words') if len(line.rstrip('\n')) >= min_length} def contents_in_dictionary(dictionary, in_string): return all([y in dictionary for y in in_

codereview stackoverflow generator algorithm python

pattern minor by @import:stackexchange-codereview 112d ago

Transforming a list of two-dimensional coordinates into a flat list of relative changes

I have a little function that parses a nested list of coordinates into a flat list of compressed coordinates. By compressed coordinates, I mean that only the delta (distance) between consecutive coordinates is stored in the list, and the float coordinates are converted to integers.

```
input = [[-8081441,5685214], [-8081446,5685216], [-8081442,5685219], [-8081440,5685211], [-8081441,5685214]]
output = [-8081441, 5685214, 5, -2, -4, -3, -2, 8, 1, -3]

def parseCoords(coords):
    #keep the first x,y coordinates
    parsed = [int(coords[0][0]), int(coords[0][1])]
    for i in xrange(1, len(coords)):
        parsed.extend([int(coords[i-1][0]) - int(coords[i][0]), int(coords[i-1][1]) - int(coords[i][1])])
    return parsed

parsedCoords = parseCoords(input)
```

As the input list is really big, is there a way to improve the function, maybe by using generators or a list comprehension?
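One way the generator idea could look is sketched below. It is written for Python 3 (the post uses Python 2's `xrange`), the name `iter_deltas` is hypothetical, and it keeps the post's sign convention of previous minus current.

```python
from typing import Iterable, Iterator, Sequence


def iter_deltas(coords: Iterable[Sequence[float]]) -> Iterator[int]:
    """Yield the first x, y as integers, then (previous - current) for each later point."""
    iterator = iter(coords)
    try:
        first = next(iterator)
    except StopIteration:
        return  # empty input yields nothing
    prev_x, prev_y = int(first[0]), int(first[1])
    yield prev_x
    yield prev_y
    for point in iterator:
        # Each coordinate is converted to int once and reused as the next "previous".
        x, y = int(point[0]), int(point[1])
        yield prev_x - x
        yield prev_y - y
        prev_x, prev_y = x, y


if __name__ == "__main__":
    sample = [[-8081441, 5685214], [-8081446, 5685216], [-8081442, 5685219],
              [-8081440, 5685211], [-8081441, 5685214]]
    print(list(iter_deltas(sample)))
```

The generator never indexes into the input, so it also accepts a lazily produced stream of points, and the caller decides whether to materialise the result with `list(...)`.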

codereview stackoverflow generator python python-2.x

pattern minor by @import:stackexchange-codereview 112d ago

Testing a Random number generator

Firstly, I would appreciate some code-review feedback from a TDD and design perspective. Secondly, what are your thoughts on implementing the test case `testNumbersAreUnique()`? The fact that the API returns a `Set` of immutable objects means the elements will be unique, but I just wanted to know your thoughts. The reason I used a `do-while` rather than a `while` or `for` loop (iterating ‘x’ times to populate ‘x’ random numbers) is to ensure the set has the correct number of elements (which will of course be unique) even if the random number generator gives something that already exists in the Set. The problem: Lottery Generator. The goal of the program is to generate ‘x’ distinct lottery numbers within a given range 1 to ‘N’. Both x and N are int for simplicity. Following is my code. I wrote a controller, which will call the service layer to fetch ‘x’ distinct lottery numbers: ``` public class Controller { private LotteryGeneratorService lotteryGeneratorService; @RequestMapping(method=RequestMethod.GET) public Set getLotteryNumbers(int x){ lotteryGeneratorService = new LotteryGeneratorService(); return lotteryGeneratorService.generateNRandom (x); } } ``` This is the Service class: ``` public class LotteryGeneratorService { private Set set = new HashSet<>(); private static int generateRandomInt(){ Random random = new Random(); return random.nextInt(N); } public Set generateNRandom(int x){ int tmp; do{ tmp =generateRandomInt(); if(tmp!=0) set.add(tmp); } while(set.size()<x); return set; } public int getSizeOfRandomlyGeneratedSet(){ return set.size(); } } ``` The tests: ``` public class LotteryTest { LotteryGeneratorService lotteryGeneratorService; int N; @Before public void setup() { lotteryGeneratorService = new LotteryGeneratorService(); N=100; } @After public void tearDown() { lotteryGeneratorService = null; } @Test public void testNumberOfNumbersGeneratedIsCorrect() { assertEqual

random unit-testing codereview java stackoverflow

pattern minor by @import:stackexchange-codereview 112d ago

Retrieving the nth term of an infinite stream

I recently learned how generators can be infinite in Python. For example, the infinite sequence can be the triangular numbers: $$1, 3, 6, 10,...$$

```
def triangle_stream():
    """ Infinite stream of triangle numbers. """
    x = step = 1
    yield x
    while True:
        step += 1
        x += step
        yield x
```

I wrote a function that returns the nth term in an infinite sequence.

```
def nth_term(generator, n):
    """ Returns the nth term of an infinite generator. """
    for i, x in enumerate(generator()):
        if i == n - 1:
            return x
```

As an example:

```
>>> nth_term(triangle_stream, 10)
55
```

But this `nth_term` function also works:

```
def nth_term(generator, n):
    """ Returns the nth term of an infinite generator. """
    x = generator()
    for i in range(n - 1):
        next(x)
    return next(x)
```

Which of these two `nth_term` functions is more efficient?
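A third variant worth comparing uses `itertools.islice` to skip the first `n - 1` items. The sketch below is only an illustration: `nth_term_islice` is a hypothetical name, and `triangle_stream` is repeated from the post so the block runs on its own. Which version is fastest would need to be measured, for example with `timeit`.

```python
from itertools import islice


def triangle_stream():
    """Infinite stream of triangle numbers (repeated from the post)."""
    x = step = 1
    yield x
    while True:
        step += 1
        x += step
        yield x


def nth_term_islice(generator, n):
    """Return the nth term (1-based) by letting islice skip the first n - 1 items."""
    return next(islice(generator(), n - 1, None))


if __name__ == "__main__":
    print(nth_term_islice(triangle_stream, 10))
```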

codereview generator stackoverflow performance python

pattern minor by @import:stackexchange-codereview 112d ago

Generator for the Collatz conjecture sequence

I tried to write this code as concisely as possible. Is this the best way to do it?

```
def collatz(n):
    """
    Generator for collatz sequence beginning with n
    >>> list(collatz(10))
    [5, 16, 8, 4, 2, 1]
    """
    while n != 1:
        n = n / 2 if n % 2 == 0 else 3*n + 1
        yield int(n)
```
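One point to consider: in Python 3, `/` is true division, so `n` becomes a float after the first halving, and floats cannot represent every integer above 2**53. A minimal sketch that stays in integers with `//` is shown below. The name `collatz_int` and the input check are assumptions added for illustration, not part of the post.

```python
from typing import Iterator


def collatz_int(n: int) -> Iterator[int]:
    """Yield the Collatz sequence after n, using integer arithmetic only."""
    if n < 1:
        # Assumption: only positive integers are meaningful inputs here.
        raise ValueError("n must be a positive integer")
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n


if __name__ == "__main__":
    print(list(collatz_int(10)))
```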

codereview collatz-sequence generator stackoverflow algorithm

pattern minor by @import:stackexchange-codereview 112d ago

Using generators to print the tree-like structure of a project

The goal of this code is to print the entity tree of a VHDL project. There is a README and some very minimal tests in the GitHub repo. I am trying to refactor the code to use generators in order to familiarize myself with `yield`, and to think in terms of iterators, not raw data structures. Here is my current attempt at doing so: `import re from sys import argv from os import walk from os.path import join as pjoin EXCLUDES = ["implementation", "testbench"] BASIC_ID_REGEX = "[a-z][a-z0-9]*(?:_[a-z0-9]+)*" def _vhdltree(level, filepath, pattern, vhd_files): for entity, component in find_entities(filepath, pattern): """Codereview: I am not specifically interested in feedbacks for the following snippet (except about the recursive design), but if you have a particularly elegant solution which keeps the "UI" part very minimal, suggestions are welcome""" path = vhd_files.get(component.lower(), "Not Found") print(" "*level + entity + " : " + path) if path != "Not Found":#Probably ugly, but lazy _vhdltree(level+1, path, pattern, vhd_files) def find_entities(filepath, pattern): with open(filepath) as f: for l in f: m = pattern.match(l) if m: yield m.group('entity'), m.group('component').split(".")[-1] def find_vhd(proot): for (dirpath, dirnames, filenames) in walk(proot): if not isexcluded(dirpath.lower()): for fn in filenames: if fn[-4:].lower() == ".vhd": yield fn[:-4].lower(), pjoin(dirpath, fn) def isexcluded(path): for excluder in EXCLUDES: if excluder in path: return True return False def vhdltree(filepath, proot): instantiation_regex = ("\s*(?P<entity>{0})\s*:\s*entity\s*(?P<component>{0}(?:\.{0})*)" # NOQA .format(BASIC_ID_REGEX)) p = re.compile(instantiation_regex, re.IGNORECASE) vhd_files = dict(find_vhd(proot)) _vhdl

iterator codereview stackoverflow generator python

pattern minor by @import:stackexchange-codereview 112d ago

Concatenating two IEnumerables with a limit

I've been playing around with generators, generics and extension methods in C# (5.0) and wanted to create an extension method for `IEnumerable<T>`, which would append another `IEnumerable<T>` to it and allow a limit to be put on the whole thing. The way this differs from just using `Concat()` and then `Take()` is that instead of passing an `IEnumerable<T>`, you pass a function that takes an `int` (which represents the maximum number of elements needed from the second `IEnumerable<T>`) and returns an `IEnumerable<T>`. Here's what I came up with:

```
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExt
{
    public static IEnumerable<T> ConcatLimiting<T>(this IEnumerable<T> first, Func<int, IEnumerable<T>> getSecondWithLimit, int limit = int.MaxValue)
    {
        int elems = 0;
        foreach (T elem in first.Take(limit))
        {
            yield return elem;
            elems++;
        }
        foreach (T elem in getSecondWithLimit(limit - elems).Take(limit - elems))
        {
            yield return elem;
        }
        yield break;
    }
}
```

An example of a use case for this is when you need a limited number of records from two different databases. You can write something like:

```
return GetEnumerableFromDatabase(someDatabase, 20)
    .ConcatLimiting(limit => GetEnumerableFromDatabase(someOtherDatabase, limit))
    .ToList();
```

My concerns:

- Are the two `IEnumerable`s going to be handled correctly, assuming that the resulting one is properly disposed of?
- I have tried out quite a few possible scenarios and it seems to be handled correctly, but I'm not entirely confident.
- Does the name `ConcatLimiting()` conform to what you usually see in C#?
- Is this extension method completely useless?
- Does an easy way to do this already exist?
- Am I more or less following best practices?
- Are there any other glaring issues?

linq codereview csharp generator stackoverflow

pattern minor by @import:stackexchange-codereview 112d ago

Extracting specific rows and columns from a CSV file

I have written a function to selectively extract data from a file. I want to be able to extract only from a certain line onward and only a given span of each row. Would converting this function into a generator reduce the overhead when I need to process large files?

```
import itertools
import csv

def data_extraction(filename,start_line,lenght,span_start,span_end):
    with open(filename, "r") as myfile:
        file_= csv.reader(myfile, delimiter=' ') #extracts data from .txt as lines
        return (x for x in [filter(lambda a: a != '', row[span_start:span_end]) \
                            for row in itertools.islice(file_, start_line, lenght)])
```
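In the function above, the inner list comprehension builds every selected row before the outer generator expression yields anything, so the whole selection is held in memory at once. A minimal sketch of a true generator is shown below, assuming Python 3. The name `iter_rows` and the parameter name `stop_line` are hypothetical, and `stop_line` plays the role of the post's `lenght` argument (the stop index passed to `islice`).

```python
import csv
from itertools import islice
from typing import Iterator, List


def iter_rows(filename: str, start_line: int, stop_line: int,
              span_start: int, span_end: int) -> Iterator[List[str]]:
    """Yield one filtered row at a time; the file stays open while rows are consumed."""
    with open(filename, newline="") as handle:
        reader = csv.reader(handle, delimiter=" ")
        for row in islice(reader, start_line, stop_line):
            # Drop the empty fields produced by repeated spaces, as the post does.
            yield [field for field in row[span_start:span_end] if field != ""]
```

Because the `yield` sits inside the `with` block, the file is closed only when the generator is exhausted or discarded, and only one row is in memory at a time.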

codereview csv stackoverflow generator performance

pattern minor by @import:stackexchange-codereview 112d ago

Random name generation in Ruby

I created this program as an exercise to learn Ruby. It is a random name generator with a set of rules defined to generate pretty decent and usually pronounceable names. Ruby isn't exactly my favourite language, but I needed to learn it. The names produced consist of both Latin letters and a few non-Latin symbols like å, ä and ö. The code is fairly straightforward, but I suspect there are a few things I do in the code that aren't considered good practice. It would be helpful in my path to learning Ruby to get a code review of my program: ``` require 'unicode' # Constants containing all consonants and vowels in the latin alphabet + some # extra non-latin letters. The number after each letter represents how common # a letter should be CONS_LATIN = ['b']*100 + ['c']*100 + ['d']*100 + ['f']*100 + ['g']*100 + ['h']*100 + ['j']*100 + ['k']*100 + ['l']*100 + ['m']*100 + ['n']*100 + ['p']*100 + ['q']*85 + ['r']*100 + ['s']*100 + ['t']*100 + ['v']*100 + ['w']*50 + ['x']*75 + ['z']*50 VOWS_LATIN = ['a']*100 + ['e']*100 + ['i']*100 + ['o']*100 + ['u']*100 + ['y']*75 VOWS_EXTRA = ['ĳ']*75 + ['å']*100 + ['ä']*100 + ['ö']*100 + ['ø']*75 + ['æ']*60 # Banned combinations which are hard to pronounce or look weird BANNED_COMBOS = [['g','j'],['f','k'],['b','k'],['q','p'],['w','q'],['q','g'],['x','x'],['q', 'q'],['d','b']] def getRandomVowel # Only 10% chance to generate random "non-latin" vowel if rand() <= 0.1 return VOWS_EXTRA.sample else return VOWS_LATIN.sample end end def getRandomVowelNoDuplicates(str:) # Generate a random vowel and if it a non-latin vowel # then we only use it if it has not been previously used in str vowel = getRandomVowel while VOWS_EXTRA.include? vowel and str.include? vowel vowel = getRandomVowel end return vowel end def getRandomConsonante return CONS_LATIN.sample end def getLastCharactersFromString(str:, numChars:) return Unicode::downcase (str[-numChars, numChars].to_s)

random codereview generator stackoverflow game