56 Commits

Author SHA1 Message Date
David Yip
a63dff0d5b dashboard: JSLint happiness, one === at a time. 2015-07-17 23:13:51 +00:00
David Yip
773f5887e7 dashboard: Don't show blank notes. 2015-07-17 23:13:51 +00:00
David Yip
1766d9f653 dashboard: Remove extraneous space when note not present. 2015-07-17 23:13:51 +00:00
David Yip
54dee8c8c4 dashboard: Show short notes to justify jobs.
We're starting to have jobs that run to hundreds of gigabytes, and the
question "why are we archiving this?" is coming up more often in IRC.
Showing notes on the dashboard might be a good way to quickly
communicate rationale.
2015-07-17 23:13:51 +00:00
Ivan Kozik
e0f8bde356 Shorten several things in the status lines 2015-07-17 23:13:51 +00:00
Ivan Kozik
b3d84e505c Add ?host= URL param for testing dashboard.html locally 2015-07-17 23:13:51 +00:00
David Yip
c1f12411ca Show remaining queue length in dashboard. #96.
Also show total number of queued and downloaded items as a title
attribute.
2015-07-17 23:13:51 +00:00
Ivan Kozik
de4ea44e9c Link to the new docs 2015-07-17 23:13:51 +00:00
Ivan Kozik
b261297c1d Remove link to dashboard2 2015-07-17 23:13:51 +00:00
Ivan Kozik
ddafec8c27 Fix ws:// and /logs/recent URLs 2015-07-17 23:13:51 +00:00
Ivan Kozik
fa4f193af9 Import the first revision of dashboard 2.0 in the ArchiveBot repo 2015-07-17 23:13:41 +00:00
Ivan Kozik
5229ddf5dc Start work on websocket server for future dashboard integration 2015-07-17 22:42:25 +00:00
Ivan Kozik
03d1efc2ce Clarify argument order requirement 2015-07-17 03:59:42 +00:00
Ivan Kozik
62cba3a0e7 Update UA 2015-05-19 20:52:29 +00:00
Ivan Kozik
2d9b1395f1 Put the date into the DIR name and WARC name 2015-05-19 18:13:03 +00:00
Ivan Kozik
66d22b1556 Update UA 2015-04-08 20:42:10 +00:00
Ivan Kozik
c83c89b0cf Remove --no-skip-getaddrinfo to match ArchiveBot 2015-04-07 07:26:54 +00:00
Ivan Kozik
2983615d50 Send Accept-Language to avoid 500 Internal Server Error when sending Firefox UA to reddit.com 2015-03-30 03:07:23 +00:00
Ivan Kozik
8bf1b00c46 Copy in latest dupespotter 2015-03-14 20:36:16 +00:00
Ivan Kozik
040afe0d92 Copy in latest dupespotter 2015-03-14 19:35:55 +00:00
Ivan Kozik
9fbe0bdb6f Copy in latest dupespotter 2015-03-11 22:54:35 +00:00
Ivan Kozik
d14d8135e6 Copy in latest dupespotter 2015-03-11 20:22:58 +00:00
Ivan Kozik
c140f1caf0 Copy in latest dupespotter 2015-03-09 07:35:55 +00:00
Ivan Kozik
b785e9b1b7 Pause the crawl when running low on disk or memory 2015-03-09 06:02:24 +00:00
Ivan Kozik
cbaccd9e02 Avoid creating directories with ? or & in their names, which breaks
sqlalchemy when it tries to parse arguments from the filename.

Fixes https://github.com/ludios/grab-site/issues/1
2015-03-09 05:16:54 +00:00
Ivan Kozik
f80df6944f Describe arguments more 2015-03-09 05:06:44 +00:00
Ivan Kozik
611a0be845 Cleanup 2015-03-09 04:53:38 +00:00
Ivan Kozik
820e2aeef4 Mention WARC files; clarify 2015-03-09 04:52:18 +00:00
Ivan Kozik
a1cbcb9ea9 Describe what this is 2015-03-09 04:48:27 +00:00
Ivan Kozik
3fe4774c2c Copy in latest dupespotter 2015-03-04 04:37:39 +00:00
Ivan Kozik
1f9f80dff0 Copy in latest dupespotter 2015-03-04 04:23:06 +00:00
Ivan Kozik
fbbfa3c0b4 Copy in latest dupespotter 2015-03-01 23:43:53 +00:00
Ivan Kozik
5b2b68061d Copy in latest dupespotter 2015-02-24 05:03:45 +00:00
Ivan Kozik
85f02d2055 Include path/query components in directory name 2015-02-23 03:03:15 +00:00
Ivan Kozik
62866f5336 Copy in latest dupespotter 2015-02-17 01:58:56 +00:00
Ivan Kozik
ccaee25497 Link to global ignore set 2015-02-05 19:32:47 +00:00
Ivan Kozik
e2118bbea4 Clarify 2015-02-05 19:31:50 +00:00
Ivan Kozik
4a22b4d593 Tell user to install git as well 2015-02-05 19:27:19 +00:00
Ivan Kozik
65e096a035 Support --ignore-sets= instead of the space-separated version 2015-02-05 06:05:54 +00:00
Ivan Kozik
2d7125951f Link to pythex 2015-02-05 05:39:44 +00:00
Ivan Kozik
f815920a83 Document file formats 2015-02-05 05:37:34 +00:00
Ivan Kozik
d73ee5ba27 Make it real obvious 2015-02-05 05:34:49 +00:00
Ivan Kozik
2f7ae834bb Add ArchiveBot LICENSE 2015-02-05 05:22:18 +00:00
Ivan Kozik
0699689a14 Add igoff feature 2015-02-05 05:19:34 +00:00
Ivan Kozik
2ccb8b4d6f Add support for --no-offsite-links 2015-02-05 05:15:46 +00:00
Ivan Kozik
52d0acc3b5 Another html5lib comment 2015-02-05 05:08:03 +00:00
Ivan Kozik
64d027da2c Fix comment 2015-02-05 05:07:05 +00:00
Ivan Kozik
6f8ef82efb Rename script 2015-02-05 05:05:53 +00:00
Ivan Kozik
979b843458 Load changes from DIR/ignores and DIR/ignore_sets while the crawl is running 2015-02-05 04:59:28 +00:00
Ivan Kozik
5f7593fda2 Refactor 2015-02-05 04:39:52 +00:00
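
Commit cbaccd9e02 above describes keeping ? and & out of directory names so that a database path built from the directory is not misread as carrying query arguments. The sketch below illustrates that idea only. The helper name safe_dir_name and the choice of "_" as the replacement character are assumptions made for illustration; they are not taken from the grab-site source.

```python
import re


def safe_dir_name(url: str) -> str:
    """Hypothetical helper: derive a directory name from a URL.

    Replaces characters that a URL-style parser could treat as
    path or query-string syntax ('/', '?', '&') with underscores.
    """
    # Drop the scheme (http://, https://, ...) if present.
    name = re.sub(r"^[a-z][a-z0-9+.-]*://", "", url, flags=re.IGNORECASE)
    # Replace path separators and query delimiters.
    name = re.sub(r"[/?&]", "_", name)
    # Trim underscores left over from leading or trailing separators.
    return name.strip("_")


if __name__ == "__main__":
    print(safe_dir_name("http://example.com/a?b=1&c=2"))
```

A name produced this way still includes the path and query components of the URL (as in commit 85f02d2055), but without the characters that the commit message reports as breaking argument parsing.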