Near-Duplicate Detection using MinHash: Background

Written by Taro Sato, on April 12, 2014. Tagged: stats Python math

There are numerous pieces of duplicate information served by multiple sources on the web. Many news stories that we receive from the media tend to originate from the same source, such as the Associated Press. When such content is scraped from the web for archiving, a need may arise to categorize documents by their similarity (not in the sense of meaning, but in the sense of character-level or lexical matching). ... Continue reading.

A Trick for Computing the Sum of Geometric Series

Written by Taro Sato, on April 4, 2014. Tagged: math

Say I need to compute the sum of a series like this one: \begin{equation} y = 1 + 2 x + 3 x^2 + 4 x^3 + \dots \ , \label{eq:a} \end{equation} where \(|x| < 1\). This series looks like a geometric series, in which case the sum can be computed from ... Continue reading.
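As a quick numerical sanity check on the series above, here is a minimal Python sketch. It compares a truncated direct sum against the standard closed form \(1/(1-x)^2\), which follows from differentiating the geometric series term by term. The excerpt is cut off before the trick is stated, so this may not be the route the full post takes; the function names `partial_sum` and `closed_form` are made up for illustration.

```python
def partial_sum(x, n_terms=200):
    """Directly sum the first n_terms of 1 + 2x + 3x^2 + 4x^3 + ..."""
    return sum((n + 1) * x**n for n in range(n_terms))


def closed_form(x):
    """Standard closed form 1 / (1 - x)^2, valid only for |x| < 1."""
    if abs(x) >= 1:
        raise ValueError("the series converges only for |x| < 1")
    return 1.0 / (1.0 - x) ** 2


if __name__ == "__main__":
    for x in (0.5, -0.3, 0.9):
        # More terms are needed as |x| approaches 1.
        print(x, partial_sum(x, n_terms=2000), closed_form(x))
```

The two columns should agree to within floating-point error for any \(|x| < 1\), provided enough terms are kept.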

Half-Light Radii for Various Profiles

Written by Taro Sato, on February 14, 2011. Tagged: astro math

For a radial profile of \(I( r)\), the enclosed flux within the radius \(r\) is given by \begin{equation*} F( r) = \int_{0}^{2 \pi} d\phi \int_{0}^{r} dr r I(r, \phi) \ . \end{equation*} I’m only concerned with azimuthally symmetric cases, so \(F( r) = 2\pi \int_{0}^{r} dr r I( r)\). ... Continue reading.
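The azimuthally symmetric integral above is easy to evaluate numerically. The sketch below is only an illustration, not code from the post: it uses a plain trapezoidal rule, picks an exponential profile \(I(r) = e^{-r}\) as an assumed example (total flux \(2\pi\)), and finds the half-light radius by bisection. All function names are hypothetical.

```python
import math


def enclosed_flux(profile, r, n=10_000):
    """Trapezoidal estimate of F(r) = 2*pi * integral_0^r r' I(r') dr'."""
    h = r / n
    total = 0.0
    for i in range(n + 1):
        rp = i * h
        weight = 0.5 if i in (0, n) else 1.0  # endpoints get half weight
        total += weight * rp * profile(rp)
    return 2.0 * math.pi * total * h


def exponential_profile(r):
    """Assumed example profile I(r) = exp(-r), with unit scale length."""
    return math.exp(-r)


def half_light_radius(profile, total_flux, r_hi=50.0, tol=1e-6):
    """Bisect for the radius enclosing half of total_flux.

    Assumes F(r) increases monotonically and that r_hi encloses
    more than half of the total flux.
    """
    lo, hi = 0.0, r_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if enclosed_flux(profile, mid) < 0.5 * total_flux:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(half_light_radius(exponential_profile, 2.0 * math.pi))
```

For this assumed profile the integral can also be done by hand, \(F( r) = 2\pi \left[ 1 - (1 + r) e^{-r} \right]\), which gives a half-light radius of about 1.68 scale lengths and serves as a check on the numerical result.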

Biboroku

Tagged: Math
