humungus - miniwebproxy

118:30becd2c8110 on 2022-04-04 17:40:22 -0400 EDT by Ted Unangst <tedu@tedunangst.com>
Tagged: tip

another day, another lazy loader

117:05a3b5e6d62f on 2022-04-01 13:03:54 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

no comments

116:031b9f8b56f5 on 2022-03-26 01:46:55 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

save this so we don't lose it

115:704db8257555 on 2022-03-25 23:45:03 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

better filters

114:82c204fd1e56 on 2022-03-06 01:13:11 -0500 EST by Ted Unangst <tedu@tedunangst.com>

almost in place

113:3ee62c9c7289 on 2022-03-01 01:59:36 -0500 EST by Ted Unangst <tedu@tedunangst.com>

get it going

112:d1c93cd16201 on 2022-03-01 01:34:44 -0500 EST by Ted Unangst <tedu@tedunangst.com>

start reworking to use yaegi

111:ecf89679307a on 2019-08-21 19:32:45 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

just hostname here, minus port, should make things easier to work with

110:3a518d8c5476 on 2019-08-13 13:31:08 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

rework filtering to provide some more features. try to document some of it even.

109:13d24fec8950 on 2019-08-13 12:23:35 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

simplify close code a bit

108:42bdd9ba3918 on 2019-05-05 20:08:11 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

update permitted tags

107:eb4595c0222d on 2019-04-20 22:37:25 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

go.mod build

106:4fce3676492f on 2019-04-20 22:31:34 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

filter to chew on toots as well

105:7238d11a3008 on 2019-03-29 06:38:49 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

the main tag may be just the tag we're searching for

104:aa7305dc13b4 on 2019-03-29 04:49:44 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

allow id attribute, so #urls work

103:97e428504e7b on 2019-03-17 21:40:28 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

repair accidental commit to filter.lua

102:716f74ab310e on 2019-03-16 20:02:58 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

connect to port 80 is not tls. also add back a bit of logging.

101:729c973f7982 on 2019-03-16 17:13:19 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.9 for changeset c9662fb3d18c

100:c9662fb3d18c on 2019-03-07 00:54:44 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.9

attempt to deal with websockets. maybe it works.

99:c628a41412f8 on 2019-03-06 23:12:29 -0500 EST by Ted Unangst <tedu@tedunangst.com>

forgot, easier to just peek than read/unread

98:cbf01ee074fd on 2019-03-06 21:45:07 -0500 EST by Ted Unangst <tedu@tedunangst.com>

use a semi custom transport for roundtrip

97:474533847da0 on 2019-03-06 21:34:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

simplify code a bit. notably, http.transport does a better job handling
idle connections. this seems faster and more reliable.

96:7da945c3e9a8 on 2019-03-06 18:44:35 -0500 EST by Ted Unangst <tedu@tedunangst.com>

keepalive means we need better timeout handling

95:0a0c57454caf on 2019-03-06 02:50:25 -0500 EST by Ted Unangst <tedu@tedunangst.com>

keep the connection open (for https).
reduce useless logging.

94:8eb249211290 on 2019-02-17 18:46:21 -0500 EST by Ted Unangst <tedu@tedunangst.com>

slip the command line filter into the main exec

93:cbd4f6489b34 on 2019-02-16 03:58:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

replace some silly writestring(sprintf) calls with fprintf

92:e019d060bcbd on 2019-02-14 16:41:58 -0500 EST by Ted Unangst <tedu@tedunangst.com>

include some more information in the error message sent to browser

91:0645f631c15b on 2019-02-14 00:52:12 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.8 for changeset ce93d20d49b5

90:ce93d20d49b5 on 2019-02-14 00:51:27 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.8

update readme to latest style

89:f3ebcf339db4 on 2019-02-08 19:39:32 -0500 EST by Ted Unangst <tedu@tedunangst.com>

a tweak here and there to the filter rules

88:a9dffb52f2bb on 2019-02-06 18:07:50 -0500 EST by Ted Unangst <tedu@tedunangst.com>

copyright

87:d9bf148b8b53 on 2019-02-06 14:11:31 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.7 for changeset 185ab726a7c3

86:185ab726a7c3 on 2019-02-06 14:09:14 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.7

silly to keep release.sh here

85:1ca3f54c087d on 2019-02-06 00:04:12 -0500 EST by Ted Unangst <tedu@tedunangst.com>

sometimes content is in .content

84:4e4586688580 on 2019-02-05 23:56:06 -0500 EST by Ted Unangst <tedu@tedunangst.com>

a bit more precision about head vs body

83:617bcd628e81 on 2019-02-05 23:41:58 -0500 EST by Ted Unangst <tedu@tedunangst.com>

allow link tags such as found in head

82:c37f354d51a0 on 2019-02-05 23:29:01 -0500 EST by Ted Unangst <tedu@tedunangst.com>

reformat some code

81:4889092776c9 on 2019-02-05 23:22:01 -0500 EST by Ted Unangst <tedu@tedunangst.com>

chunked transfer encoding

80:ec4aa2a53c7b on 2019-02-05 23:13:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

in case we're not gzipping, use a bufio

79:c1ca1af16325 on 2019-02-05 22:18:36 -0500 EST by Ted Unangst <tedu@tedunangst.com>

possibly faster to escape and write in one pass

78:d27154536eb3 on 2019-02-05 20:18:34 -0500 EST by Ted Unangst <tedu@tedunangst.com>

allow tables to have rowspan and colspan

77:e8c947e81753 on 2019-02-05 20:13:38 -0500 EST by Ted Unangst <tedu@tedunangst.com>

include links to embedded youtube videos instead of hiding

76:441879680eb7 on 2019-02-05 13:41:19 -0500 EST by Ted Unangst <tedu@tedunangst.com>

the html we write will always be utf-8. prevent encoding mismatches.

75:60a881a55615 on 2019-01-31 00:30:53 -0500 EST by Ted Unangst <tedu@tedunangst.com>

a function to save unfiltered url for debugging troublesome sites

74:a7a3d5f01a1d on 2019-01-29 19:16:12 -0500 EST by Ted Unangst <tedu@tedunangst.com>

still need to send 200 when not intercepting

73:19d90b1a0aae on 2019-01-29 16:36:31 -0500 EST by Ted Unangst <tedu@tedunangst.com>

move outbound connection up earlier so we can report connect errors

72:f28b9ea6f7f6 on 2019-01-26 03:13:25 -0500 EST by Ted Unangst <tedu@tedunangst.com>

tweak the locking to be a bit more efficient

71:54e059527372 on 2019-01-26 03:06:21 -0500 EST by Ted Unangst <tedu@tedunangst.com>

rename some variables to make things easier to follow

70:47ea1a33bf28 on 2019-01-25 21:39:35 -0500 EST by Ted Unangst <tedu@tedunangst.com>

start using context and deadlines. incomplete.

69:9b564e604721 on 2019-01-25 16:18:01 -0500 EST by Ted Unangst <tedu@tedunangst.com>

openbsd defaults to 2048 bit key, but make it explicit

68:533ff0a1b979 on 2019-01-25 16:16:58 -0500 EST by Ted Unangst <tedu@tedunangst.com>

clean up the cert cache expiry code a bit

67:0b7b20e95709 on 2019-01-25 04:50:26 -0500 EST by Ted Unangst <tedu@tedunangst.com>

stop using responsewriter after hijacking

66:39a32f662a75 on 2019-01-25 03:19:39 -0500 EST by Ted Unangst <tedu@tedunangst.com>

move a function

65:86af52ab5f26 on 2019-01-25 02:06:58 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.6 for changeset dba7d89250ad

64:dba7d89250ad on 2019-01-25 02:06:17 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.6

oops, long standing bug in luainterface. forgot to pop return.
broke filtering after shared interpreters change.

63:164837d976ce on 2019-01-24 06:22:03 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.5 for changeset 43cd9b74804d

62:43cd9b74804d on 2019-01-24 06:07:26 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.5

one lua interpreter per request should be enough

61:337150fb25e5 on 2019-01-24 06:00:54 -0500 EST by Ted Unangst <tedu@tedunangst.com>

rework the filtering pass to avoid all the temp buffers.
should be a bit leaner and faster now.

60:82a534fd2cef on 2019-01-22 07:43:09 -0500 EST by Ted Unangst <tedu@tedunangst.com>

fixes and tweaks to filtering,
mostly to get busted blogspot pages working

59:8e8de1b52985 on 2019-01-22 00:00:19 -0500 EST by Ted Unangst <tedu@tedunangst.com>

govendor does a better job of vendoring

58:e8b8be4268dd on 2019-01-20 17:59:44 -0500 EST by Ted Unangst <tedu@tedunangst.com>

some sites use brotli, which isn't decoded. for now, hack around this
by forcing gzip accept-encoding instead of adding dependencies.

57:cdf3b25fe8ba on 2019-01-20 17:50:09 -0500 EST by Ted Unangst <tedu@tedunangst.com>

add a few more rules to filter.lua

56:b37b3156fb8f on 2019-01-20 17:49:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>

tweak rules to be simpler

55:b584c78fa4cd on 2019-01-17 21:06:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

add a command line filter for testing and experimentation

54:abf64af5b136 on 2019-01-16 06:52:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.4 for changeset 360e7f70b8b7

53:360e7f70b8b7 on 2019-01-16 06:52:07 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.4

clean up release dir a little

52:fb53e7298eb7 on 2019-01-16 06:50:08 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.3 for changeset 14024fd1ccf8

51:14024fd1ccf8 on 2019-01-16 06:49:47 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.3

always name output miniwebproxy

50:020731045977 on 2019-01-16 06:45:08 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.2 for changeset fb5ca99cad8d

49:fb5ca99cad8d on 2019-01-16 06:45:00 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.2

rename resign.sh to just sign.sh

48:7e035cab1f46 on 2019-01-16 06:43:56 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.1 for changeset ea3aa57467fb

47:ea3aa57467fb on 2019-01-16 06:42:50 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.1

release machinery

46:4eb8ad0180e5 on 2019-01-16 06:21:16 -0500 EST by Ted Unangst <tedu@tedunangst.com>

Added tag v0.9.0 for changeset 5d050db048ae

45:5d050db048ae on 2019-01-16 06:09:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Tagged: v0.9.0

improve makefile

44:5e23f47c109c on 2019-01-16 02:37:12 -0500 EST by Ted Unangst <tedu@tedunangst.com>

update some notes

43:1222668343bc on 2019-01-16 02:33:40 -0500 EST by Ted Unangst <tedu@tedunangst.com>

not used here, but support more types in the luainterface reflector

42:1133d3a09316 on 2019-01-15 08:45:57 -0500 EST by Ted Unangst <tedu@tedunangst.com>

do not pass bodybytes to lua. it's huge and inefficient.

41:ccf7b75479bd on 2019-01-15 07:11:29 -0500 EST by Ted Unangst <tedu@tedunangst.com>

allow the proxy to use a proxy, such as tor.

40:9878222b5971 on 2019-01-11 05:45:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>

re-sort imports

39:23083bb802e9 on 2019-01-11 05:21:15 -0500 EST by Ted Unangst <tedu@tedunangst.com>

go fmt

38:7b58cce48648 on 2019-01-11 05:17:31 -0500 EST by Ted Unangst <tedu@tedunangst.com>

filtering on github and amazon is usually bad

37:c3b7a3eacc8e on 2019-01-11 05:17:02 -0500 EST by Ted Unangst <tedu@tedunangst.com>

remove annoying zipr div

36:d56a40b6a232 on 2019-01-11 05:16:41 -0500 EST by Ted Unangst <tedu@tedunangst.com>

proxy is supposed to do connection: close

35:b0e80c5c6872 on 2019-01-01 17:41:38 -0500 EST by Ted Unangst <tedu@tedunangst.com>

sup

34:1a322662576d on 2018-12-31 22:32:42 -0500 EST by Ted Unangst <tedu@tedunangst.com>

make sure outbound dial has a port

33:af463b194af6 on 2018-12-31 22:20:53 -0500 EST by Ted Unangst <tedu@tedunangst.com>

more bugs fixed by simply reusing request in http case

32:6f156798a685 on 2018-12-31 14:59:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

h4-h6 are ok tags

31:909ab243d0e2 on 2018-12-31 01:48:02 -0500 EST by Ted Unangst <tedu@tedunangst.com>

use io.WriteString instead of []byte casts

30:437c340fa0f0 on 2018-12-31 01:19:33 -0500 EST by Ted Unangst <tedu@tedunangst.com>

copy all the headers for http proxy. should just write it back.

29:1a56300de514 on 2018-12-29 13:53:17 -0500 EST by Ted Unangst <tedu@tedunangst.com>

stupid times has articles that inconsistently put img in figure
but sometimes leave it out.

28:6934120c56dc on 2018-12-28 17:57:12 -0500 EST by Ted Unangst <tedu@tedunangst.com>

add a simple signing script

27:a7593cc2cef9 on 2018-12-27 16:55:07 -0500 EST by Ted Unangst <tedu@tedunangst.com>

some hosts do not like seeing a port number in host. strip it.

26:fcb2d7177cf6 on 2018-12-25 23:50:30 -0500 EST by Ted Unangst <tedu@tedunangst.com>

need to copy cookies for plain http.
probably need to do more, but this moves stuff along.

25:b8dc83d2e08d on 2018-12-24 17:55:02 -0500 EST by Ted Unangst <tedu@tedunangst.com>

seems we should care about errors creating certs

24:457709242321 on 2018-12-24 15:02:02 -0500 EST by Ted Unangst <tedu@tedunangst.com>

allow forcing reddit to old.reddit.com

23:33295d890e66 on 2018-12-24 13:49:33 -0500 EST by Ted Unangst <tedu@tedunangst.com>

can use init function to sort arrays

22:f9cdde536621 on 2018-12-24 13:25:49 -0500 EST by Ted Unangst <tedu@tedunangst.com>

refine permitted tags a bit. more table stuff

21:2cea426d5fd7 on 2018-12-06 00:11:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>

link to noscript parse bug issue

20:90b8971637bc on 2018-12-06 00:06:00 -0500 EST by Ted Unangst <tedu@tedunangst.com>

s is an allowed tag

19:5e8f62f4c390 on 2018-12-06 00:05:24 -0500 EST by Ted Unangst <tedu@tedunangst.com>

remove obsolete filtering code. it lives in lua now.

18:d193d249ae19 on 2018-11-21 18:30:50 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Parent: 16:d91b9f1883de

m

17:4e99fa132268 on 2018-11-21 18:29:56 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Parent: 9:1732887fe591

belated status update

16:d91b9f1883de on 2018-01-13 11:03:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

repair some image urls that hide in data attrs

15:1164da4ee688 on 2018-01-13 05:25:38 -0500 EST by Ted Unangst <tedu@tedunangst.com>

allow skipping http too

14:fd7f751bf2f6 on 2018-01-12 12:02:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>

don't print inside of script tag

13:59cdfd81a6aa on 2018-01-12 08:47:54 -0500 EST by Ted Unangst <tedu@tedunangst.com>

sometimes the gist shit isn't in an iframe, but just a script

12:df5863a71ea6 on 2018-01-12 08:47:02 -0500 EST by Ted Unangst <tedu@tedunangst.com>

a few more selectors to find an article

11:05442c988748 on 2018-01-12 08:36:34 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Parent: 9:1732887fe591

m

10:91e663507a32 on 2018-01-12 08:35:50 -0500 EST by Ted Unangst <tedu@tedunangst.com>
Parent: 4:0e7d2596575e

move shouldfilter logic into lua

9:1732887fe591 on 2017-12-12 01:32:37 -0500 EST by Ted Unangst <tedu@tedunangst.com>

quick fix for multiple articles; don't rewrite yet

8:b8ef96dd7c2f on 2017-12-12 01:29:05 -0500 EST by Ted Unangst <tedu@tedunangst.com>

small fix to grab more header info

7:522c943acca4 on 2017-12-12 00:51:18 -0500 EST by Ted Unangst <tedu@tedunangst.com>

some sites like using picture. try to make it an img

6:1c2d98f8189a on 2017-12-11 23:08:49 -0500 EST by Ted Unangst <tedu@tedunangst.com>

much more aggressive about article filtering

5:8a9ba430d277 on 2017-12-11 23:08:28 -0500 EST by Ted Unangst <tedu@tedunangst.com>

quick fix to get filtering for plain http as well

4:0e7d2596575e on 2017-12-11 20:31:55 -0500 EST by Ted Unangst <tedu@tedunangst.com>

make lua filters a bit more useful

3:a7820b23fba8 on 2017-12-09 21:24:51 -0500 EST by Ted Unangst <tedu@tedunangst.com>

start moving the filtering logic into a lua script

2:5e78fe97de58 on 2017-10-31 15:59:45 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

a case study in regridding homepages, with weird json special sauce

1:001533479baf on 2017-10-25 16:11:32 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

parse nearly all html responses.
allows filtering by indicators other than url.

0:df53184ac9cd on 2017-10-23 23:13:13 -0400 EDT by Ted Unangst <tedu@tedunangst.com>

miniwebproxy