Author Archives: Taro

Using NTP Server to Synchronize Time on Debian Jessie

For a one-time synchronization, it is easy. The server to use can be chosen from the list given here.


Installing MongoDB on Debian Jessie

On Debian, it is of course easy to install. My main use of local database servers is for testing, however, and I don’t want MongoDB to take up more than a few GB under /var/lib for journal files. I can … Continue reading


Installing PyData Stack on Debian Jessie

Installing NumPy, SciPy, and Matplotlib has gotten much easier with pip, but each of them has some dependencies that are not taken care of automatically. I had some issues with sudo when the installer could not find X. … Continue reading


Installing PostgreSQL on Debian Jessie

where yourusername is the username of your account. Note that with the -s switch the user will be created as a superuser. For assigning a more restricted role, see the official documentation. Go back to your normal shell, and do … Continue reading


Interpreting A/B Test using Python

Suppose we ran an A/B test with two different versions of a web page, A and B, for which we count the number of visitors and whether they convert or not. We can summarize this in a contingency table showing the … Continue reading
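One common way to read a 2x2 contingency table of this kind is Pearson's chi-square test of independence. The excerpt does not say which test the post uses, so the following is only a minimal sketch under that assumption. The function name and the visitor counts are hypothetical, chosen for illustration, and the code assumes every expected cell count is nonzero.

```python
import math


def chi_square_2x2(conv_a, total_a, conv_b, total_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Rows are the page versions, columns are converted / not converted.
    Returns the chi-square statistic and its p-value (1 degree of freedom).
    """
    observed = [
        [conv_a, total_a - conv_a],
        [conv_b, total_b - conv_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if conversion were independent of the version.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 20 of 100 visitors convert on one version, 40 of 100 on the other.
chi2, p_value = chi_square_2x2(20, 100, 40, 100)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the conversion rates of the two versions differ by more than chance alone would explain. For small counts, an exact test may be a better choice than this approximation.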


Brand Positioning by Correspondence Analysis

I was reading an article about visualization techniques using multidimensional scaling (MDS), correspondence analysis in particular. The example used R, but as usual I wanted to find a way to do it in Python, so here goes. The correspondence analysis … Continue reading


Using Custom Theme with SyntaxHighlighter Evolved

I have been using SyntaxHighlighter Evolved for displaying code snippets on this site. While the WordPress plugin has been working very well, I seem to lose my custom CSS styles every time the plugin is updated. I want to … Continue reading


PCA and Biplot using Python

There are several ways to run principal component analysis (PCA) using various packages (scikit-learn, statsmodels, etc.) or even by rolling your own through singular-value decomposition and such. The PCA result can be visualized with a biplot. I was looking … Continue reading
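As a rough illustration of the "roll your own through singular-value decomposition" route, here is a minimal sketch using NumPy. It is not the code from the post; the function name `pca_svd` and the random toy data are hypothetical.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """PCA via SVD of the column-centered data matrix.

    Returns the component scores, the principal axes (one per row),
    and the variance explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)

    # Thin SVD: X_centered = U @ diag(S) @ Vt, singular values in descending order.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

    components = Vt[:n_components]
    scores = X_centered @ components.T
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, components, explained_variance


# Hypothetical toy data: 50 observations of 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
scores, components, explained_variance = pca_svd(X, n_components=2)
print(scores.shape, components.shape)
```

In a biplot, the rows of `scores` would be drawn as points and the columns of `components` as arrows for the original variables, usually after some rescaling.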


Near-duplicate Detection using MinHash: Background

There are numerous pieces of duplicate information served by multiple sources on the web. Many news stories that we receive from the media tend to originate from the same source, such as the Associated Press. When such content is scraped … Continue reading


A Trick for Computing the Sum of Geometric Series

Say I need to compute the sum of a particular series. It looks like a geometric series, in which case the sum could be computed from the usual closed-form formula. The coefficients vary, so … Continue reading
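For reference, the textbook closed form for a plain finite geometric series, a + a*r + ... + a*r^(n-1) = a*(1 - r^n)/(1 - r) for r != 1, can be checked numerically as below. This is only a sketch of the standard identity, not the series or the trick from the post, which the excerpt does not show. The function names are hypothetical.

```python
def geometric_sum(a, r, n):
    """Sum of the first n terms a, a*r, ..., a*r**(n-1) via the closed form."""
    if r == 1:
        # The closed form divides by zero here; every term equals a.
        return a * n
    return a * (1 - r ** n) / (1 - r)


def geometric_sum_direct(a, r, n):
    """The same sum, computed term by term for comparison."""
    return sum(a * r ** k for k in range(n))


# 3 + 6 + 12 + 24
print(geometric_sum(3, 2, 4), geometric_sum_direct(3, 2, 4))
```

The closed form only applies when every term shares the same coefficient, so a series with varying coefficients needs a different approach.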
