frequency analysis of a DB column


  • From: gagsl-py2 at yahoo.com.ar (Gabriel Genellina)
  • Subject: frequency analysis of a DB column
  • Date: Thu, 02 Aug 2007 00:38:28 -0300

On Wed, 01 Aug 2007 23:21:53 -0300, goldtech <goldtech at worldpost.com>
wrote:

> In Python 2.1 are there any tools to take a column from a DB and do a
> frequency analysis - a breakdown of the values for this column?
>
> Possibly a histogram or a table saying out of 500 records I have one
> hundred and two "301" ninety-eight "212" values and three-hundred
> "410"?
> Is SQL the way to go for this?

I'd start with:

select column, count(column), min(column), max(column)
 from table
group by column
order by count(column) desc

and then build a histogram from that (using PyChart, for instance). Based
on this distribution, one can refine the analysis in a lot of ways...
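
For illustration only, here is a minimal sketch of that approach. It is not part of the original reply and rests on several assumptions: it uses an in-memory SQLite database through the standard sqlite3 module as a stand-in for the real DBMS, it targets a current Python rather than 2.1, and the table name `records`, the column name `code`, and the helpers `column_frequencies` and `text_histogram` are all hypothetical. The sample data reuses the counts from the question (102, 98 and 300 out of 500), and a plain-text bar chart stands in for PyChart.

```python
import sqlite3


def column_frequencies(conn, table, column):
    """Return (value, count) rows for one column, most frequent first.

    Table and column names cannot be bound as SQL parameters, so they are
    interpolated here; only pass trusted identifiers.
    """
    query = (
        'SELECT "{col}", COUNT("{col}") FROM "{tbl}" '
        'GROUP BY "{col}" ORDER BY COUNT("{col}") DESC'
    ).format(col=column, tbl=table)
    return conn.execute(query).fetchall()


def text_histogram(freqs, width=50):
    """Format (value, count) rows as text bars scaled to the largest count."""
    if not freqs:
        return []
    top = max(count for _, count in freqs)
    lines = []
    for value, count in freqs:
        bar = "#" * max(1, round(count * width / top))
        lines.append("{:>8} {:>6} {}".format(value, count, bar))
    return lines


# Hypothetical sample data matching the counts mentioned in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (code TEXT)")
values = ["301"] * 102 + ["212"] * 98 + ["410"] * 300
conn.executemany("INSERT INTO records (code) VALUES (?)", [(v,) for v in values])

freqs = column_frequencies(conn, "records", "code")
for line in text_histogram(freqs):
    print(line)
```

The counting is left to the database, so only one row per distinct value comes back to Python, which should keep this workable even with thousands of distinct values.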

> Of course there'd be 1000's of values....

Should not be a problem for today's DBMS and hardware...

-- 
Gabriel Genellina