Alexa generously provides its list of top 1,000,000 websites as a CSV to download from Amazon S3:
http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
The following terminal cast shows how to download the zip file via curl, unzip it, use head to see the top 10, and then use grep to find where nytimes.com is on the list, along with the 5 entries above and below it.
Here are just the commands, without the stdout:
$ curl -o topsites.zip http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
$ unzip topsites.zip
$ head top-1m.csv
$ grep -C 5 '\bnytimes.com' top-1m.csv
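The grep pattern is loose: it is not anchored at the end of the line and the dot matches any character, so it can also match subdomains or longer names that merely begin with nytimes.com. When only the rank of one exact domain is wanted, a minimal sketch with awk could look like the following. The function name rank_of is hypothetical, and the sketch assumes the two-column rank,domain layout that head shows.

```shell
# rank_of DOMAIN [FILE]: print the "rank,domain" line whose second
# comma-separated field equals DOMAIN exactly, then stop reading.
# Assumes each line looks like "3,nytimes.com"; FILE defaults to top-1m.csv.
rank_of() {
    awk -F, -v d="$1" '$2 == d { print; exit }' "${2:-top-1m.csv}"
}
```

After defining the function, rank_of nytimes.com prints the single matching line, or nothing if the domain is not on the list.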
Other examples of easy analyses
Top 50 top-level domains by frequency:
$ ack -ho '(?<=\.)\w+$' top-1m.csv | sort | uniq -c | sort -rn | head -n 50
502021 com
51998 ru
51298 net
45065 org
24728 jp
22880 de
17604 in
16075 uk
14515 br
12609 cn
12601 ir
12103 it
10918 pl
10284 fr
10132 info
7288 au
7073 nl
6854 es
6253 gr
5599 ua
5380 co
4618 tw
4584 ca
4464 cz
4416 tr
4160 eu
4140 kr
3847 id
3799 tv
3583 me
3556 ro
3542 mx
3363 biz
3344 edu
3184 vn
3085 za
3048 se
3036 hu
2509 us
2403 xyz
2401 ch
2302 ar
2243 be
2178 at
2100 dk
1988 io
1919 sk
1851 no
1792 cc
1714 fi
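If ack is not installed, roughly the same count can be produced with awk by splitting each line on dots and keeping the last piece. This is a sketch rather than an exact equivalent: the function name tld_counts is hypothetical, and unlike the \w+ pattern above it accepts any characters after the final dot.

```shell
# tld_counts [FILE] [N]: count the last dot-separated label of every line
# that contains at least one dot, and print the N most frequent labels
# in the same "count tld" form that uniq -c produces.
# FILE defaults to top-1m.csv and N defaults to 50.
tld_counts() {
    awk -F. 'NF > 1 { print $NF }' "${1:-top-1m.csv}" |
        sort | uniq -c | sort -rn | head -n "${2:-50}"
}
```

Calling tld_counts with no arguments should give a list comparable to the one above, provided top-1m.csv is in the current directory.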
Finding the top 100 .edu domains (mostly U.S. academic institutions):
$ grep '\.edu$' top-1m.csv | head -n 100
Result:
721,academia.edu
745,mit.edu
1064,harvard.edu
1077,stanford.edu
1393,psu.edu
1519,berkeley.edu
1548,purdue.edu
1758,cornell.edu
1801,umich.edu
2023,ucla.edu
2067,washington.edu
2150,columbia.edu
2264,wisc.edu
2386,umn.edu
2397,utexas.edu
2449,nyu.edu
2544,illinois.edu
2616,upenn.edu
2748,ucdavis.edu
3001,cuny.edu
3052,ucsd.edu
3089,usc.edu
3283,cmu.edu
3334,ufl.edu
3341,msu.edu
3425,umd.edu
3469,unc.edu
3536,princeton.edu
3565,asu.edu
3841,yale.edu
3860,uchicago.edu
3864,tamu.edu
3876,rutgers.edu
3934,duke.edu
4021,bu.edu
4128,jhu.edu
4254,iu.edu
4290,osu.edu
4315,uci.edu
4381,ncsu.edu
4395,utah.edu
4448,northwestern.edu
4513,virginia.edu
4779,arizona.edu
4802,phoenix.edu
4864,colorado.edu
5307,gatech.edu
5334,usg.edu
5368,vt.edu
5530,liberty.edu
5644,ucsb.edu
5781,vanderbilt.edu
5798,byu.edu
5826,oregonstate.edu
6177,gsu.edu
6477,pitt.edu
6594,iastate.edu
6643,si.edu
6679,umass.edu
6694,uga.edu
6723,uiowa.edu
6736,uw.edu
6746,colostate.edu
6753,wustl.edu
6754,usf.edu
6857,snhu.edu
7058,hawaii.edu
7099,wisconsin.edu
7137,indiana.edu
7188,gwu.edu
7324,tufts.edu
7352,uh.edu
7458,georgetown.edu
7478,wsu.edu
7658,umuc.edu
7818,ucf.edu
7832,unl.edu
7838,gmu.edu
7863,nd.edu
8255,ucsc.edu
8403,brown.edu
8422,gcu.edu
8522,uoregon.edu
8575,rochester.edu
8611,vccs.edu
8717,fsu.edu
8780,ku.edu
8941,emory.edu
8980,ucr.edu
9014,mnscu.edu
9057,buffalo.edu
9073,uky.edu
9138,missouri.edu
9314,ohio-state.edu
9436,caltech.edu
9456,uic.edu
9585,ucsf.edu
9655,dartmouth.edu
9855,neu.edu
9882,uconn.edu
Top 100 .gov sites (U.S. government domains):
220,nih.gov
424,irs.gov
513,ca.gov
808,weather.gov
946,state.gov
993,ed.gov
1259,nasa.gov
1432,ny.gov
1490,cdc.gov
1512,noaa.gov
1639,ssa.gov
2246,wa.gov
2327,usajobs.gov
2437,va.gov
2465,nps.gov
2508,uscis.gov
2524,usda.gov
2582,dhs.gov
3171,nyc.gov
3318,usembassy.gov
3658,texas.gov
3884,healthcare.gov
4059,fda.gov
4196,whitehouse.gov
4351,usgs.gov
4352,uspto.gov
4368,virginia.gov
4501,sec.gov
4655,uscourts.gov
4672,ohio.gov
4718,loc.gov
4758,in.gov
5024,census.gov
5134,michigan.gov
5266,mo.gov
6050,epa.gov
6103,usa.gov
6240,utah.gov
6364,maryland.gov
6631,illinois.gov
6743,pa.gov
7009,bls.gov
7067,hhs.gov
7214,ga.gov
7357,mass.gov
8010,studentloans.gov
8177,house.gov
8228,ct.gov
8248,cbp.gov
8283,oregon.gov
8623,cms.gov
8677,wi.gov
9049,dot.gov
9239,tn.gov
9296,nist.gov
9302,eftps.gov
9331,colorado.gov
9571,sba.gov
9574,opm.gov
9575,senate.gov
9780,medicare.gov
9849,ftc.gov
9911,faa.gov
10636,ky.gov
10735,archives.gov
11013,dol.gov
11068,cia.gov
11357,georgia.gov
11472,cancer.gov
11594,fcc.gov
11682,tsa.gov
11714,hud.gov
11882,recreation.gov
12508,fbi.gov
12510,energy.gov
12788,justice.gov
13295,kingcounty.gov
13321,dc.gov
13613,lacounty.gov
14211,seattle.gov
14584,gsa.gov
14644,sc.gov
14645,osha.gov
14913,tsp.gov
15103,usps.gov
15496,fema.gov
15611,clinicaltrials.gov
15865,fairfaxcounty.gov
16141,alaska.gov
16480,treasury.gov
16911,wisconsin.gov
17344,eia.gov
17567,nc.gov
17577,flhsmv.gov
17759,nsf.gov
17922,idaho.gov
18219,maricopa.gov
18472,mt.gov
18719,mymedicare.gov
18825,ok.gov
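The command behind the .gov list is not shown; presumably it mirrors the .edu one with the suffix swapped. A small sketch that generalizes both, using a hypothetical helper named top_by_tld, relies on the file already being sorted by rank, so the first N matches are the top N:

```shell
# top_by_tld TLD [N] [FILE]: print the first N lines whose domain ends in
# ".TLD". The pattern expands to e.g. '\.gov$', the same form as the .edu
# grep. N defaults to 100 and FILE defaults to top-1m.csv.
top_by_tld() {
    grep "\.$1\$" "${3:-top-1m.csv}" | head -n "${2:-100}"
}
```

With this defined, top_by_tld gov lists the top 100 .gov entries and top_by_tld edu 10 lists the top 10 .edu entries.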