piss

entries

  1. Mini NES Electronics Kit Status Update
    daftmike's blog 1970-08-29T13:08:00+00:00
  2. Mini NES Electronics Kit Status Update #2
    daftmike's blog 1970-09-13T14:24:00+00:00
  3. Shipping delay update :(
    daftmike's blog 1970-10-30T12:14:00+00:00
  4. Api
    Neovim 2001-01-01T00:00:00+00:00
  5. Api-ui-events
    Neovim 2001-01-01T00:00:00+00:00
  6. Autocmd
    Neovim 2001-01-01T00:00:00+00:00
  7. Backers
    Neovim 2001-01-01T00:00:00+00:00
  8. Change
    Neovim 2001-01-01T00:00:00+00:00
  9. Channel
    Neovim 2001-01-01T00:00:00+00:00
  10. Cmdline
    Neovim 2001-01-01T00:00:00+00:00
  11. Credits
    Neovim 2001-01-01T00:00:00+00:00
  12. Debug
    Neovim 2001-01-01T00:00:00+00:00
  13. Deprecated
    Neovim 2001-01-01T00:00:00+00:00
  14. Dev
    Neovim 2001-01-01T00:00:00+00:00
  15. Dev_arch
    Neovim 2001-01-01T00:00:00+00:00
  16. Dev_style
    Neovim 2001-01-01T00:00:00+00:00
  17. Dev_test
    Neovim 2001-01-01T00:00:00+00:00
  18. Dev_theme
    Neovim 2001-01-01T00:00:00+00:00
  19. Dev_tools
    Neovim 2001-01-01T00:00:00+00:00
  20. Dev_vimpatch
    Neovim 2001-01-01T00:00:00+00:00
  21. Develop
    Neovim 2001-01-01T00:00:00+00:00
  22. Diagnostic
    Neovim 2001-01-01T00:00:00+00:00
  23. Diff
    Neovim 2001-01-01T00:00:00+00:00
  24. Digraph
    Neovim 2001-01-01T00:00:00+00:00
  25. Editing
    Neovim 2001-01-01T00:00:00+00:00
  26. Editorconfig
    Neovim 2001-01-01T00:00:00+00:00
  27. Faq
    Neovim 2001-01-01T00:00:00+00:00
  28. Filetype
    Neovim 2001-01-01T00:00:00+00:00
  29. Fold
    Neovim 2001-01-01T00:00:00+00:00
  30. Ft_ada
    Neovim 2001-01-01T00:00:00+00:00
  31. Ft_hare
    Neovim 2001-01-01T00:00:00+00:00
  32. Ft_ps1
    Neovim 2001-01-01T00:00:00+00:00
  33. Ft_raku
    Neovim 2001-01-01T00:00:00+00:00
  34. Ft_rust
    Neovim 2001-01-01T00:00:00+00:00
  35. Ft_sql
    Neovim 2001-01-01T00:00:00+00:00
  36. Gui
    Neovim 2001-01-01T00:00:00+00:00
  37. Health
    Neovim 2001-01-01T00:00:00+00:00
  38. Helphelp
    Neovim 2001-01-01T00:00:00+00:00
  39. If_perl
    Neovim 2001-01-01T00:00:00+00:00
  40. If_pyth
    Neovim 2001-01-01T00:00:00+00:00
  41. If_ruby
    Neovim 2001-01-01T00:00:00+00:00
  42. Indent
    Neovim 2001-01-01T00:00:00+00:00
  43. Index
    Neovim 2001-01-01T00:00:00+00:00
  44. Insert
    Neovim 2001-01-01T00:00:00+00:00
  45. Intro
    Neovim 2001-01-01T00:00:00+00:00
  46. Job_control
    Neovim 2001-01-01T00:00:00+00:00
  47. L10n-arabic
    Neovim 2001-01-01T00:00:00+00:00
  48. L10n-hebrew
    Neovim 2001-01-01T00:00:00+00:00
  49. L10n-russian
    Neovim 2001-01-01T00:00:00+00:00
  50. L10n-vietnamese
    Neovim 2001-01-01T00:00:00+00:00
  51. Lsp
    Neovim 2001-01-01T00:00:00+00:00
  52. Lua
    Neovim 2001-01-01T00:00:00+00:00
  53. Lua-bit
    Neovim 2001-01-01T00:00:00+00:00
  54. Lua-guide
    Neovim 2001-01-01T00:00:00+00:00
  55. Lua-plugin
    Neovim 2001-01-01T00:00:00+00:00
  56. Luaref
    Neovim 2001-01-01T00:00:00+00:00
  57. Luvref
    Neovim 2001-01-01T00:00:00+00:00
  58. Map
    Neovim 2001-01-01T00:00:00+00:00
  59. Mbyte
    Neovim 2001-01-01T00:00:00+00:00
  60. Message
    Neovim 2001-01-01T00:00:00+00:00
  61. Mlang
    Neovim 2001-01-01T00:00:00+00:00
  62. Motion
    Neovim 2001-01-01T00:00:00+00:00
  63. News
    Neovim 2001-01-01T00:00:00+00:00
  64. News-0.10
    Neovim 2001-01-01T00:00:00+00:00
  65. News-0.11
    Neovim 2001-01-01T00:00:00+00:00
  66. News-0.12
    Neovim 2001-01-01T00:00:00+00:00
  67. News-0.9
    Neovim 2001-01-01T00:00:00+00:00
  68. Nvim
    Neovim 2001-01-01T00:00:00+00:00
  69. Nvim_terminal_emulator
    Neovim 2001-01-01T00:00:00+00:00
  70. Options
    Neovim 2001-01-01T00:00:00+00:00
  71. Pack
    Neovim 2001-01-01T00:00:00+00:00
  72. Pattern
    Neovim 2001-01-01T00:00:00+00:00
  73. Pi_gzip
    Neovim 2001-01-01T00:00:00+00:00
  74. Pi_msgpack
    Neovim 2001-01-01T00:00:00+00:00
  75. Pi_paren
    Neovim 2001-01-01T00:00:00+00:00
  76. Pi_spec
    Neovim 2001-01-01T00:00:00+00:00
  77. Pi_tar
    Neovim 2001-01-01T00:00:00+00:00
  78. Pi_tutor
    Neovim 2001-01-01T00:00:00+00:00
  79. Pi_zip
    Neovim 2001-01-01T00:00:00+00:00
  80. Plugins
    Neovim 2001-01-01T00:00:00+00:00
  81. Provider
    Neovim 2001-01-01T00:00:00+00:00
  82. Quickfix
    Neovim 2001-01-01T00:00:00+00:00
  83. Quickref
    Neovim 2001-01-01T00:00:00+00:00
  84. Recover
    Neovim 2001-01-01T00:00:00+00:00
  85. Remote
    Neovim 2001-01-01T00:00:00+00:00
  86. Remote_plugin
    Neovim 2001-01-01T00:00:00+00:00
  87. Repeat
    Neovim 2001-01-01T00:00:00+00:00
  88. Rileft
    Neovim 2001-01-01T00:00:00+00:00
  89. Scroll
    Neovim 2001-01-01T00:00:00+00:00
  90. Sign
    Neovim 2001-01-01T00:00:00+00:00
  91. Spell
    Neovim 2001-01-01T00:00:00+00:00
  92. Starting
    Neovim 2001-01-01T00:00:00+00:00
  93. Support
    Neovim 2001-01-01T00:00:00+00:00
  94. Syntax
    Neovim 2001-01-01T00:00:00+00:00
  95. Tabpage
    Neovim 2001-01-01T00:00:00+00:00
  96. Tagsrch
    Neovim 2001-01-01T00:00:00+00:00
  97. Term
    Neovim 2001-01-01T00:00:00+00:00
  98. Terminal
    Neovim 2001-01-01T00:00:00+00:00
  99. Tips
    Neovim 2001-01-01T00:00:00+00:00
  100. Treesitter
    Neovim 2001-01-01T00:00:00+00:00
  101. Tui
    Neovim 2001-01-01T00:00:00+00:00
  102. Uganda
    Neovim 2001-01-01T00:00:00+00:00
  103. Ui
    Neovim 2001-01-01T00:00:00+00:00
  104. Undo
    Neovim 2001-01-01T00:00:00+00:00
  105. Userfunc
    Neovim 2001-01-01T00:00:00+00:00
  106. Usr_01
    Neovim 2001-01-01T00:00:00+00:00
  107. Usr_02
    Neovim 2001-01-01T00:00:00+00:00
  108. Usr_03
    Neovim 2001-01-01T00:00:00+00:00
  109. Usr_04
    Neovim 2001-01-01T00:00:00+00:00
  110. Usr_05
    Neovim 2001-01-01T00:00:00+00:00
  111. Usr_06
    Neovim 2001-01-01T00:00:00+00:00
  112. Usr_07
    Neovim 2001-01-01T00:00:00+00:00
  113. Usr_08
    Neovim 2001-01-01T00:00:00+00:00
  114. Usr_09
    Neovim 2001-01-01T00:00:00+00:00
  115. Usr_10
    Neovim 2001-01-01T00:00:00+00:00
  116. Usr_11
    Neovim 2001-01-01T00:00:00+00:00
  117. Usr_12
    Neovim 2001-01-01T00:00:00+00:00
  118. Usr_20
    Neovim 2001-01-01T00:00:00+00:00
  119. Usr_21
    Neovim 2001-01-01T00:00:00+00:00
  120. Usr_22
    Neovim 2001-01-01T00:00:00+00:00
  121. Usr_23
    Neovim 2001-01-01T00:00:00+00:00
  122. Usr_24
    Neovim 2001-01-01T00:00:00+00:00
  123. Usr_25
    Neovim 2001-01-01T00:00:00+00:00
  124. Usr_26
    Neovim 2001-01-01T00:00:00+00:00
  125. Usr_27
    Neovim 2001-01-01T00:00:00+00:00
  126. Usr_28
    Neovim 2001-01-01T00:00:00+00:00
  127. Usr_29
    Neovim 2001-01-01T00:00:00+00:00
  128. Usr_30
    Neovim 2001-01-01T00:00:00+00:00
  129. Usr_31
    Neovim 2001-01-01T00:00:00+00:00
  130. Usr_32
    Neovim 2001-01-01T00:00:00+00:00
  131. Usr_40
    Neovim 2001-01-01T00:00:00+00:00
  132. Usr_41
    Neovim 2001-01-01T00:00:00+00:00
  133. Usr_42
    Neovim 2001-01-01T00:00:00+00:00
  134. Usr_43
    Neovim 2001-01-01T00:00:00+00:00
  135. Usr_44
    Neovim 2001-01-01T00:00:00+00:00
  136. Usr_45
    Neovim 2001-01-01T00:00:00+00:00
  137. Usr_toc
    Neovim 2001-01-01T00:00:00+00:00
  138. Various
    Neovim 2001-01-01T00:00:00+00:00
  139. Vi_diff
    Neovim 2001-01-01T00:00:00+00:00
  140. Vim_diff
    Neovim 2001-01-01T00:00:00+00:00
  141. Vimeval
    Neovim 2001-01-01T00:00:00+00:00
  142. Vimfn
    Neovim 2001-01-01T00:00:00+00:00
  143. Visual
    Neovim 2001-01-01T00:00:00+00:00
  144. Vvars
    Neovim 2001-01-01T00:00:00+00:00
  145. Windows
    Neovim 2001-01-01T00:00:00+00:00
  146. About
    Neovim 2001-01-01T00:00:00+00:00
  147. Build
    Neovim 2001-01-01T00:00:00+00:00
  148. Helptag redirect
    Neovim 2001-01-01T00:00:00+00:00
  149. Install
    Neovim 2001-01-01T00:00:00+00:00
  150. News archive
    Neovim 2001-01-01T00:00:00+00:00
  151. Roadmap
    Neovim 2001-01-01T00:00:00+00:00
  152. Screenshots
    Neovim 2001-01-01T00:00:00+00:00
  153. Sponsors
    Neovim 2001-01-01T00:00:00+00:00
  154. Vision
    Neovim 2001-01-01T00:00:00+00:00
  155. Subspace / Continuum History
    Dan Luu 2006-02-01T00:00:00+00:00
  156. History of Symbolics lisp machines
    Dan Luu 2007-11-16T00:00:00+00:00
  157. Entourage + Applescript = Frustration
    Steve Losh 2008-02-21T15:25:45+00:00
  158. Work-life balance at Bioware
    Dan Luu 2008-05-31T00:00:00+00:00
  159. Site Redesign
    Steve Losh 2009-01-11T17:58:23+00:00
  160. Going Open Source
    Steve Losh 2009-01-13T20:08:56+00:00
  161. Deploying with Fabric & Mercurial
    Steve Losh 2009-01-15T20:51:09+00:00
  162. How & Why I DJ
    Steve Losh 2009-02-06T17:53:44+00:00
  163. BumpMapping hell
    Fabien Sanglard 2009-03-04T04:33:27+00:00
  164. Mercurial Bash Prompts
    Steve Losh 2009-03-17T21:34:55+00:00
  165. Candy Colored Terminal
    Steve Losh 2009-03-18T18:26:28+00:00
  166. Fluide
    Fabien Sanglard 2009-04-15T04:33:27+00:00
  167. Fluid v1.1 up and coming...
    Fabien Sanglard 2009-05-09T04:33:27+00:00
  168. Wolfenstein 3D for iPhone code review
    Fabien Sanglard 2009-05-09T04:33:27+00:00
  169. Fluid: 1,000,000 downloads !!
    Fabien Sanglard 2009-05-14T04:33:27+00:00
  170. What I Hate About Mercurial
    Steve Losh 2009-05-29T19:51:05+00:00
  171. How to Contribute to Mercurial
    Steve Losh 2009-06-01T20:09:44+00:00
  172. Fluid2 RELEASED ! Fluid 1 now at 3,000,000 downloads !!
    Fabien Sanglard 2009-06-09T04:33:27+00:00
  173. Fluid speed issues!
    Fabien Sanglard 2009-06-29T04:33:27+00:00
  174. A Guide to Branching in Mercurial
    Steve Losh 2009-08-30T20:27:12+00:00
  175. Armadillo Space T-shirt
    Fabien Sanglard 2009-10-14T04:33:27+00:00
  176. iPhone 3D engine programming part 1
    Fabien Sanglard 2009-10-19T04:33:27+00:00
  177. Apple iPhone Tech Talk 2009 tricks and treats
    Fabien Sanglard 2009-12-03T04:33:27+00:00
  178. Don't learn Assembly on Mac OS X
    Fabien Sanglard 2009-12-31T04:33:27+00:00
  179. Are closed social networks inevitable?
    Dan Luu 2010-01-01T00:00:00+00:00
  180. How does Boston compare to SV and what do MIT and Stanford have to do with it?
    Dan Luu 2010-01-01T00:00:00+00:00
  181. Doom engine 1993 code review
    Fabien Sanglard 2010-01-13T04:33:27+00:00
  182. Moving from Django to Hyde
    Steve Losh 2010-01-15T20:14:00+00:00
  183. The Real Difference Between Mercurial and Git
    Steve Losh 2010-01-20T21:56:00+00:00
  184. My Extravagant Zsh Prompt
    Steve Losh 2010-02-01T01:05:00+00:00
  185. Doom iPhone code review
    Fabien Sanglard 2010-02-01T04:33:27+00:00
  186. Momentary latching circuit
    daftmike's blog 2010-02-12T13:52:00+00:00
  187. Low-battery indicator circuit
    daftmike's blog 2010-02-18T18:50:00+00:00
  188. Mercurial Workflows: Branch As Needed
    Steve Losh 2010-02-28T14:00:00+00:00
  189. How to build a circuit on Veroboard
    daftmike's blog 2010-03-02T20:31:00+00:00
  190. PSOne screen led-mod
    daftmike's blog 2010-03-10T10:15:00+00:00
  191. My Darling Dreamcast
    daftmike's blog 2010-03-20T15:14:00+00:00
  192. It's not even the beginning of the end...
    daftmike's blog 2010-03-27T14:13:00+00:00
  193. My Backlit Dreamcast VMU
    daftmike's blog 2010-04-09T15:28:00+00:00
  194. A Faster Feed Apart
    Steve Losh 2010-04-30T22:55:00+00:00
  195. Mercurial Workflows: Stable & Default
    Steve Losh 2010-05-17T18:27:00+00:00
  196. Tracing the baseband
    Fabien Sanglard 2010-05-27T18:08:27+00:00
  197. Mercurial Workflows: Translation Branches
    Steve Losh 2010-06-11T08:15:00+00:00
  198. A Git User's Guide to Mercurial Queues
    Steve Losh 2010-08-10T21:00:00+00:00
  199. Coming Home to Vim
    Steve Losh 2010-09-20T18:15:00+00:00
  200. Wii Sensor Bar Projector
    daftmike's blog 2010-11-24T14:48:00+00:00
  201. All about the fillrate
    Fabien Sanglard 2010-12-11T21:36:45+00:00
  202. SHMUP Lite
    Fabien Sanglard 2010-12-19T21:36:45+00:00
  203. Oh yes I'm working on new stuff...
    Plogue R&D 2010-12-23T19:48:00+00:00
  204. To become a good C programmer
    Fabien Sanglard 2011-02-02T21:36:45+00:00
  205. patience
    Plogue R&D 2011-02-04T21:17:00+00:00
  206. Lamer Exterminator, or how a 22 year old malware can still piss you off.
    Plogue R&D 2011-02-19T20:03:00+00:00
  207. To generate 60fps videos on iOS
    Fabien Sanglard 2011-02-21T21:36:45+00:00
  208. dEngine Source Code Released
    Fabien Sanglard 2011-04-28T21:36:45+00:00
  209. The Great AdLib Fire ...
    Plogue R&D 2011-05-04T00:58:00+00:00
  210. Does ... not ... compute...
    Plogue R&D 2011-05-10T14:04:00+00:00
  211. Going Paper-Free for $220
    Steve Losh 2011-05-26T13:44:00+00:00
  212. The reluctant US SMS that didnt want to be japanese
    Plogue R&D 2011-06-07T01:46:00+00:00
  213. soldiering on.
    What Was Found 2011-06-09T20:33:17+00:00
  214. (SAFE) US SMS Japanese Mod in action
    Plogue R&D 2011-06-24T17:40:00+00:00
  215. Playing the revolution/Home Computer Invasion Documentaries in Trouble???
    Plogue R&D 2011-06-25T19:20:00+00:00
  216. Polygon Codec
    Fabien Sanglard 2011-06-26T21:36:45+00:00
  217. Esperanto
    What Was Found 2011-06-28T20:24:22+00:00
  218. SHMUP Source Code
    Fabien Sanglard 2011-06-30T07:36:45+00:00
  219. Django Advice
    Steve Losh 2011-06-30T08:30:00+00:00
  220. It’s the thought that doesn’t count
    What Was Found 2011-06-30T21:12:49+00:00
  221. Seriously?
    What Was Found 2011-07-07T21:42:12+00:00
  222. Concerning monsters
    What Was Found 2011-07-14T01:12:25+00:00
  223. Hacker Monthly publication
    Fabien Sanglard 2011-07-15T01:36:45+00:00
  224. Dear dude that sneaks into my room at night and leaves recorded stories
    What Was Found 2011-07-22T23:31:12+00:00
  225. For the Record (pun intended)
    What Was Found 2011-07-26T19:06:29+00:00
  226. Your next meal will taste great
    What Was Found 2011-07-27T22:23:00+00:00
  227. Analog TV Death toll
    Plogue R&D 2011-09-01T13:37:00+00:00
  228. Writing Vim Plugins
    Steve Losh 2011-09-06T09:13:00+00:00
  229. Solving Ghost in The Wire codes
    Fabien Sanglard 2011-09-08T01:36:45+00:00
  230. Solving Ghost in The Wire codes
    Fabien Sanglard 2011-09-11T01:08:45+00:00
  231. Quake 2 Source Code Review
    Fabien Sanglard 2011-09-20T01:08:45+00:00
  232. Building a Cube64... part 1
    daftmike's blog 2011-10-09T19:47:00+00:00
  233. Arcade Restoration - Week1: Acquisition
    Plogue R&D 2011-10-14T09:38:00+00:00
  234. My weapons shed and a 360 degree C4 minefield
    What Was Found 2011-10-20T16:55:47+00:00
  235. untitled
    What Was Found 2011-10-20T19:00:20+00:00
  236. JAMMA Space Invaders experiment.
    Plogue R&D 2011-11-03T16:58:00+00:00
  237. How to build Doom3 on Mac OS X with XCode
    Fabien Sanglard 2011-11-25T01:08:45+00:00
  238. Another World Code Review
    Fabien Sanglard 2011-11-27T01:08:45+00:00
  239. Progressive playback: An atom story
    Fabien Sanglard 2011-11-27T01:08:45+00:00
  240. AY8930 sourced!
    Plogue R&D 2011-12-20T16:34:00+00:00
  241. AY8930 Initial tests!
    Plogue R&D 2011-12-22T19:54:00+00:00
  242. About this dev blog
    Evennia Devblog RSS Feed 2012-02-05T00:00:00+00:00
  243. Evennia's open bottlenecks
    Evennia Devblog RSS Feed 2012-02-05T00:00:00+00:00
  244. New Scope
    Plogue R&D 2012-02-08T15:26:00+00:00
  245. Such a small thing ...
    Evennia Devblog RSS Feed 2012-02-15T00:00:00+00:00
  246. Commands and you
    Evennia Devblog RSS Feed 2012-02-17T00:00:00+00:00
  247. Tutorial MUD, part 1: Environment setup
    TutorialMUD - pileborg.se 2012-02-18T18:28:12+00:00
  248. Tutorial MUD, part 1.5: Makefile dependencies
    TutorialMUD - pileborg.se 2012-02-21T20:47:35+00:00
  249. Dummies doing dummy things
    Evennia Devblog RSS Feed 2012-02-22T00:00:00+00:00
  250. Tutorial MUD, part 2: Logging
    TutorialMUD - pileborg.se 2012-02-22T19:19:58+00:00
  251. Android Shmup
    Fabien Sanglard 2012-02-23T01:08:45+00:00
  252. Tutorial MUD, part 3: Argument parsing
    TutorialMUD - pileborg.se 2012-03-04T10:33:41+00:00
  253. Tutorial MUD, part 4: Mainloop and signals
    TutorialMUD - pileborg.se 2012-03-10T15:38:52+00:00
  254. SSD reboot your thinking
    Fabien Sanglard 2012-03-17T01:08:45+00:00
  255. Jonathan Shapiro's Retrospective Thoughts on BitC
    Dan Luu 2012-03-23T00:00:00+00:00
  256. Tutorial MUD, part 5: Networking, part 1
    TutorialMUD - pileborg.se 2012-03-24T08:57:35+00:00
  257. Shortcuts to goodness
    Evennia Devblog RSS Feed 2012-03-26T00:00:00+00:00
  258. Be A Donor
    Fabien Sanglard 2012-04-22T01:08:45+00:00
  259. TutorialMUD hiatus
    TutorialMUD - pileborg.se 2012-04-22T12:49:04+00:00
  260. Volatile Software
    Steve Losh 2012-04-23T14:00:00+00:00
  261. Why Go?
    Nathan Youngman 2012-05-07T00:00:00+00:00
  262. Cracking Kevin Mitnick's Ghost in the Wires Paperback Edition
    Fabien Sanglard 2012-05-09T01:08:45+00:00
  263. Address Sniffing an EPROM
    Plogue R&D 2012-05-21T19:44:00+00:00
  264. Dummies doing (even more) dummy things
    Evennia Devblog RSS Feed 2012-05-30T00:00:00+00:00
  265. Doom3 Source Code Review
    Fabien Sanglard 2012-06-08T01:08:45+00:00
  266. Coding from the inside
    Evennia Devblog RSS Feed 2012-06-11T00:00:00+00:00
  267. Extending time and details
    Evennia Devblog RSS Feed 2012-06-26T00:00:00+00:00
  268. Oculus RIFT development
    Fabien Sanglard 2012-06-30T01:08:45+00:00
  269. Quake 3 Source Code Review
    Fabien Sanglard 2012-06-30T01:08:45+00:00
  270. The Caves of Clojure: Part 1
    Steve Losh 2012-07-07T17:00:00+00:00
  271. The Caves of Clojure: Part 2
    Steve Losh 2012-07-08T09:26:00+00:00
  272. The Caves of Clojure: Part 3.1
    Steve Losh 2012-07-09T09:37:00+00:00
  273. The Caves of Clojure: Part 3.2
    Steve Losh 2012-07-10T10:04:00+00:00
  274. The Caves of Clojure: Part 3.3
    Steve Losh 2012-07-11T09:25:00+00:00
  275. The Caves of Clojure: Part 3.4
    Steve Losh 2012-07-11T12:02:00+00:00
  276. The Caves of Clojure: Part 4
    Steve Losh 2012-07-12T09:42:00+00:00
  277. The Caves of Clojure: Part 5
    Steve Losh 2012-07-13T10:55:00+00:00
  278. The Caves of Clojure: Interlude 1
    Steve Losh 2012-07-14T17:06:00+00:00
  279. NES eprom carts
    Plogue R&D 2012-07-25T13:43:00+00:00
  280. The Caves of Clojure: Part 6
    Steve Losh 2012-07-30T09:50:00+00:00
  281. Namcot163 Dual 27C020 Eprom Cart
    Plogue R&D 2012-08-06T20:38:00+00:00
  282. Taking command
    Evennia Devblog RSS Feed 2012-08-16T00:00:00+00:00
  283. Combining Twisted and Django
    Evennia Devblog RSS Feed 2012-08-31T00:00:00+00:00
  284. Galaxian's digital oscillator explained.
    Plogue R&D 2012-09-08T20:41:00+00:00
  285. The Homely Mutt
    Steve Losh 2012-10-01T10:30:00+00:00
  286. A Modern Space Cadet
    Steve Losh 2012-10-03T09:55:00+00:00
  287. Community interest
    Evennia Devblog RSS Feed 2012-10-05T00:00:00+00:00
  288. The Caves of Clojure: Part 7.1
    Steve Losh 2012-10-15T09:50:00+00:00
  289. Evennia changes to BSD license
    Evennia Devblog RSS Feed 2012-10-28T00:00:00+00:00
  290. The future of TutorialMUD
    TutorialMUD - pileborg.se 2012-11-10T05:35:00+00:00
  291. on the banks of the O-rontes
    Kooneiform 2012-12-07T06:45:43+00:00
  292. Game timers: Issues and solutions
    Fabien Sanglard 2012-12-25T01:08:45+00:00
  293. 📕 Reviewing Practical Object-Oriented Design
    Nathan Youngman 2013-01-10T00:00:00+00:00
  294. Go Object Oriented Design
    Nathan Youngman 2013-01-14T00:00:00+00:00
  295. Duke Nukem 3D Code Review
    Fabien Sanglard 2013-01-17T01:08:45+00:00
  296. The best Tech books
    Fabien Sanglard 2013-01-17T01:08:45+00:00
  297. Soldering '80
    Plogue R&D 2013-01-20T02:10:00+00:00
  298. Fallout 3 – Edges
    Simonschreibt. 2013-01-20T23:12:00+00:00
  299. Teleglitch – Viewcones
    Simonschreibt. 2013-01-21T19:49:00+00:00
  300. Teleglitch – RGB Flickering
    Simonschreibt. 2013-01-21T19:53:00+00:00
  301. Diablo 3 – Trees
    Simonschreibt. 2013-01-21T19:55:00+00:00
  302. Warcraft 3 – Billboards
    Simonschreibt. 2013-01-21T19:58:00+00:00
  303. Divine Divinity – 2D Reflexion
    Simonschreibt. 2013-01-21T20:00:00+00:00
  304. Cel Shading
    Simonschreibt. 2013-01-21T20:01:00+00:00
  305. Deus Ex – Occlusion
    Simonschreibt. 2013-01-21T20:03:00+00:00
  306. Reverse Engineer Strike Commander
    Fabien Sanglard 2013-01-22T01:08:45+00:00
  307. Deus Ex 3 – Folds
    Simonschreibt. 2013-01-22T21:03:00+00:00
  308. Deus Ex – Scanlines
    Simonschreibt. 2013-01-22T22:38:00+00:00
  309. World of Warcraft – Balloon
    Simonschreibt. 2013-01-23T22:38:00+00:00
  310. Assassins Creed 3 – Windows
    Simonschreibt. 2013-01-24T22:27:00+00:00
  311. Good advice
    Kooneiform 2013-01-25T19:38:16+00:00
  312. Assassins Creed 3 – LoD Blending
    Simonschreibt. 2013-01-27T22:40:00+00:00
  313. a micro mud?
    Kooneiform 2013-01-28T00:36:15+00:00
  314. Kid Icarus – Tricks
    Simonschreibt. 2013-01-28T22:45:00+00:00
  315. Churning behind the scenes
    Evennia Devblog RSS Feed 2013-01-29T00:00:00+00:00
  316. First, there was the wheel. Then there was another wheel.
    Kooneiform 2013-02-01T03:30:01+00:00
  317. Left 4 Dead 2 – Puke
    Simonschreibt. 2013-02-01T20:25:00+00:00
  318. Sacred 2 – Crystal Reflexion
    Simonschreibt. 2013-02-03T20:47:00+00:00
  319. Sacred 2 – Pulse Shader
    Simonschreibt. 2013-02-04T22:37:00+00:00
  320. A single-threaded multiplexing server in Clojure, first attempt
    Kooneiform 2013-02-05T06:17:31+00:00
  321. Function Types in Go (golang)
    jordan orelli 2013-02-05T19:53:00+00:00
  322. Battlefield Bad Company 2 – Smoke Column
    Simonschreibt. 2013-02-11T13:08:00+00:00
  323. Kara Swisher interview of Jack Dorsey
    Dan Luu 2013-02-12T00:00:00+00:00
  324. server in Clojure, second attempt
    Kooneiform 2013-02-13T05:23:27+00:00
  325. Battlefield 2 – Flag Sound
    Simonschreibt. 2013-02-13T07:50:00+00:00
  326. Sacred 2 – Burning Map
    Simonschreibt. 2013-02-15T14:29:00+00:00
  327. Assassins Creed 3 – Bouncing Light
    Simonschreibt. 2013-02-19T09:52:00+00:00
  328. one
    jordan orelli 2013-02-20T02:49:01+00:00
  329. two
    jordan orelli 2013-02-20T05:03:24+00:00
  330. Airborn – Trees
    Simonschreibt. 2013-02-21T23:03:00+00:00
  331. color study
    jordan orelli 2013-02-22T06:27:00+00:00
  332. 1943 – Retro Shadows
    Simonschreibt. 2013-02-28T22:34:00+00:00
  333. growth
    jordan orelli 2013-03-01T02:50:27+00:00
  334. growth outline
    jordan orelli 2013-03-01T05:04:41+00:00
  335. server in Clojure, third attempt, solely for posterity’s sake
    Kooneiform 2013-03-03T17:15:01+00:00
  336. server in Clojure, fourth attempt
    Kooneiform 2013-03-04T02:11:42+00:00
  337. Metal Gear Rising – Slicing
    Simonschreibt. 2013-03-04T20:54:00+00:00
  338. Latency mitigation strategies (by John Carmack)
    Dan Luu 2013-03-05T00:00:00+00:00
  339. Dead Space 3 – Diffuse Reflections
    Simonschreibt. 2013-03-10T20:24:00+00:00
  340. Who was Rolindar?
    Kooneiform 2013-03-11T02:35:48+00:00
  341. intro to 3d
    jordan orelli 2013-03-11T04:19:57+00:00
  342. Homeworld 2: Backgrounds
    Simonschreibt. 2013-03-15T20:08:00+00:00
  343. Homeworld 2 – Backgrounds Tech
    Simonschreibt. 2013-03-17T22:46:00+00:00
  344. making roguelikes with Clojure
    Kooneiform 2013-03-22T05:22:56+00:00
  345. Diablo 3 – Resource Bubbles
    Simonschreibt. 2013-03-25T20:50:00+00:00
  346. List Out of Lambda
    Steve Losh 2013-03-30T14:00:00+00:00
  347. 007 Legends – The World
    Simonschreibt. 2013-04-01T22:12:00+00:00
  348. hexes
    jordan orelli 2013-04-03T03:02:00+00:00
  349. Homeworld 2 – Engines
    Simonschreibt. 2013-04-05T23:01:00+00:00
  350. Git Koans
    Steve Losh 2013-04-08T10:16:00+00:00
  351. Homeworld 2 – Hyperspace
    Simonschreibt. 2013-04-14T18:41:00+00:00
  352. Bioshock – Glossiness
    Simonschreibt. 2013-04-19T20:54:00+00:00
  353. Starcraft 2 – Localization
    Simonschreibt. 2013-04-24T17:54:00+00:00
  354. Doom 3 – Modding Notes
    Simonschreibt. 2013-04-30T11:28:00+00:00
  355. Doom 3 – Volumetric Glow
    Simonschreibt. 2013-05-01T20:51:00+00:00
  356. flap ya wings, little boids
    jordan orelli 2013-05-07T02:07:22+00:00
  357. Shadow World instead of TutorialMUD?
    TutorialMUD - pileborg.se 2013-05-07T17:30:35+00:00
  358. Doom 3 – HDUI
    Simonschreibt. 2013-05-09T21:00:00+00:00
  359. Meridian
    worst bedtime stories 2013-05-12T23:34:03+00:00
  360. One to Many
    Evennia Devblog RSS Feed 2013-05-13T00:00:00+00:00
  361. Photo
    Infraspace 2013-05-13T00:57:00+00:00
  362. Scribble Cel
    Simonschreibt. 2013-05-15T21:15:00+00:00
  363. Lego Batman – Crawler
    Simonschreibt. 2013-05-21T20:00:00+00:00
  364. Doom3 BFG Code Review
    Fabien Sanglard 2013-05-23T01:08:45+00:00
  365. Dead kids, dead animals, and other such jollity
    How to Spot a Psychopath 2013-05-25T03:12:59+00:00
  366. picked up an ewi over the weekend, figured out how to play the...
    jordan orelli 2013-05-31T02:01:07+00:00
  367. An excuse to use that spider photo again
    How to Spot a Psychopath 2013-06-03T01:04:09+00:00
  368. Small ridiculous object du jour
    How to Spot a Psychopath 2013-06-04T05:04:36+00:00
  369. Dungeon Keeper 2 – Walls
    Simonschreibt. 2013-06-05T22:23:00+00:00
  370. GBA SP Speaker Impulse Response
    Plogue R&D 2013-06-06T14:30:00+00:00
  371. "I will not buy this record, it is the wax tadpole."
    How to Spot a Psychopath 2013-06-12T02:59:27+00:00
  372. Prince Of Persia Code Review
    Fabien Sanglard 2013-06-14T01:08:45+00:00
  373. My ever-vigilant Perpetual-Motion-Claims Patrol
    How to Spot a Psychopath 2013-06-14T07:36:43+00:00
  374. Lego – Studs
    Simonschreibt. 2013-06-21T22:53:00+00:00
  375. Zaps and bangs
    How to Spot a Psychopath 2013-06-26T08:55:30+00:00
  376. Making arcade cabinet impulse responses.
    Plogue R&D 2013-06-29T14:02:00+00:00
  377. 1nsane Carpet 2 – Repetitive Worlds
    Simonschreibt. 2013-07-08T19:46:00+00:00
  378. Track-Best Library Updated
    nklein software 2013-07-08T22:48:25+00:00
  379. A metallurgical detective story
    How to Spot a Psychopath 2013-07-10T12:08:10+00:00
  380. Binding of Isaac – Composition
    Simonschreibt. 2013-07-15T20:46:00+00:00
  381. Inverse functions with fixed-points
    nklein software 2013-07-18T15:44:43+00:00
  382. IRC Graphs
    nklein software 2013-07-24T05:38:01+00:00
  383. Nnnnnnnnnnnyeeeeeowwww
    How to Spot a Psychopath 2013-07-28T04:02:08+00:00
  384. I’ve been replaying Earthbound since its rerelease on the...
    Zac Gorman 2013-08-07T19:14:00+00:00
  385. Mega Man X piece for this year’s Fangamer X Attract Mode...
    Zac Gorman 2013-08-08T15:31:00+00:00
  386. Company of Heroes – Shaded Smoke
    Simonschreibt. 2013-08-09T20:23:00+00:00
  387. The Bowling Game Kata in Functional Common Lisp
    nklein software 2013-08-14T23:47:06+00:00
  388. Second Reality Code Review
    Fabien Sanglard 2013-08-16T01:08:45+00:00
  389. Magical Game Time Vol.1 is now AVAILABLE! MY BOOK IS FINALLY...
    Zac Gorman 2013-08-16T22:54:41+00:00
  390. All that glisters
    How to Spot a Psychopath 2013-08-18T09:20:38+00:00
  391. Company of Heroes – Flamethrower
    Simonschreibt. 2013-08-20T19:23:00+00:00
  392. Doom III BFG Documentation
    Fabien Sanglard 2013-08-31T01:08:45+00:00
  393. About danluu.com
    Dan Luu 2013-09-01T00:00:00+00:00
  394. Reports of MWO's death have been somewhat exaggerated
    How to Spot a Psychopath 2013-09-03T03:39:56+00:00
  395. Teach, Don't Tell
    Steve Losh 2013-09-03T10:55:00+00:00
  396. More Doom III BFG Documentation
    Fabien Sanglard 2013-09-04T01:08:45+00:00
  397. Verilog is weird
    Dan Luu 2013-09-07T00:00:00+00:00
  398. Lines Are Big Circles
    nklein software 2013-09-13T17:18:10+00:00
  399. Writing safe Verilog
    Dan Luu 2013-09-15T00:00:00+00:00
  400. On killing numerous aliens with a rubber-band gun
    How to Spot a Psychopath 2013-09-17T03:49:37+00:00
  401. You found me!
    Simonschreibt. 2013-09-20T22:54:22+00:00
  402. Decyphering the Business Card Raytracer
    Fabien Sanglard 2013-09-21T01:08:45+00:00
  403. self portrait
    jordan orelli 2013-09-23T14:41:00+00:00
  404. World of Torch Siege – Blended Trunks
    Simonschreibt. 2013-09-27T19:05:09+00:00
  405. Randomize HN
    Dan Luu 2013-10-04T00:00:00+00:00
  406. Learning Legendary Hardware
    Fabien Sanglard 2013-10-07T01:08:45+00:00
  407. Super Hot – Turn-based Action
    Simonschreibt. 2013-10-08T20:10:21+00:00
  408. chipcrusher re-sampling vs frequency response
    Plogue R&D 2013-10-10T18:03:00+00:00
  409. Power-On Self-Test...
    int10h.org - VileR's blog 2013-10-10T23:06:43+00:00
  410. Using Photoshop as a CGA Bitmap Paint Program
    int10h.org - VileR's blog 2013-10-12T19:38:27+00:00
  411. The Mazes of Shamus - IBM PC Version
    int10h.org - VileR's blog 2013-10-21T04:28:51+00:00
  412. Mega Evolutions
    Zac Gorman 2013-10-21T17:53:31+00:00
  413. A list of Evennia topics
    Evennia Devblog RSS Feed 2013-10-22T00:00:00+00:00
  414. Little morning warm up drawing feat. some characters from Night...
    Zac Gorman 2013-10-25T14:35:20+00:00
  415. How to discourage open source contributions
    Dan Luu 2013-10-27T00:00:00+00:00
  416. Testing exit values in Bash
    Arabesque 2013-10-28T05:56:37+00:00
  417. Been thinking about Skull Kid and Majora’s Mask. The more...
    Zac Gorman 2013-10-28T20:05:34+00:00
  418. Photo
    Infraspace 2013-11-04T19:51:27+00:00
  419. Thanks.
    Infraspace 2013-11-05T20:16:44+00:00
  420. Why hardware development is hard
    Dan Luu 2013-11-10T00:00:00+00:00
  421. Oblivion Territory: Tree vs. Palm
    Simonschreibt. 2013-11-15T21:51:46+00:00
  422. Photo
    Zac Gorman 2013-11-22T15:19:48+00:00
  423. Sacred 2 – Floating Point Numbers
    Simonschreibt. 2013-11-22T17:53:19+00:00
  424. Photo
    Zac Gorman 2013-11-22T20:06:08+00:00
  425. dinner, 11-25-13
    food bores me 2013-11-26T04:48:00+00:00
  426. despair snack, 11-26-13
    food bores me 2013-11-26T21:46:13+00:00
  427. Out of band mergings
    Evennia Devblog RSS Feed 2013-11-28T00:00:00+00:00
  428. breakfast, 11-30-13
    food bores me 2013-11-30T15:41:01+00:00
  429. Handmade Normal Maps
    Simonschreibt. 2013-12-03T23:02:56+00:00
  430. lunch, close week, 12-2-03
    food bores me 2013-12-04T05:31:00+00:00
  431. PCA is not a panacea
    Dan Luu 2013-12-13T00:00:00+00:00
  432. Imaginary Realities is back
    Evennia Devblog RSS Feed 2013-12-16T00:00:00+00:00
  433. square flower
    jordan orelli 2013-12-23T17:18:21+00:00
  434. Data alignment and caches
    Dan Luu 2014-01-02T00:00:00+00:00
  435. Tomb Raider – Laras Hot Secrets
    Simonschreibt. 2014-01-02T12:58:12+00:00
  436. Hoplite News
    Magma Fortress 2014-01-04T07:29:00+00:00
  437. Do programmers need math?
    Dan Luu 2014-01-09T00:00:00+00:00
  438. Whipped this up on my lunch break to let everybody know...
    Zac Gorman 2014-01-09T19:42:40+00:00
  439. Prey – Evil Buttons
    Simonschreibt. 2014-01-09T23:25:53+00:00
  440. lunch, 1-8-14
    food bores me 2014-01-10T06:53:00+00:00
  441. Photo
    Zac Gorman 2014-01-10T19:38:01+00:00
  442. Looking forwards and backwards
    Evennia Devblog RSS Feed 2014-01-24T00:00:00+00:00
  443. pulse
    jordan orelli 2014-01-26T21:40:00+00:00
  444. additive test
    jordan orelli 2014-01-28T13:37:21+00:00
  445. FUN FACT: (1) head of cabbage, when mixed with (1) potato and...
    food bores me 2014-01-29T00:35:00+00:00
  446. loving where this blog is going!!!
    food bores me 2014-01-29T22:10:57+00:00
  447. no water, no fish
    jordan orelli 2014-02-02T22:30:00+00:00
  448. Photo
    Zac Gorman 2014-02-04T20:57:00+00:00
  449. I promised you failure, and lo, here it is. This is a large bowl...
    food bores me 2014-02-06T06:10:00+00:00
  450. Moving from Google Code to Github
    Evennia Devblog RSS Feed 2014-02-08T00:00:00+00:00
  451. Why don't schools teach debugging?
    Dan Luu 2014-02-08T00:00:00+00:00
  452. CAUTION INSERT SECURELY LEST POWER CORD SHOULD BE DETACHED IN...
    Infraspace 2014-02-10T16:25:00+00:00
  453. Algorithms and Data structures books: One size doesn't fit them all
    Fabien Sanglard 2014-02-14T01:08:45+00:00
  454. This is my entire dinner. This is it, this is what a grown adult...
    food bores me 2014-02-19T03:01:00+00:00
  455. Don’t starve, Diablo – Parallax 7
    Simonschreibt. 2014-02-24T22:38:37+00:00
  456. Poupée de Son - Narrative Game based on Grimm’s “Hare’s...
    winnie song 2014-02-25T04:44:00+00:00
  457. ULFBERT - Name of an unusually strong and lasting Scandinavian...
    winnie song 2014-02-25T04:45:00+00:00
  458. Repair - A game about fortifying the ground you stand on. [DL...
    winnie song 2014-02-25T04:45:00+00:00
  459. Hellsmouth Concept
    winnie song 2014-02-25T04:46:00+00:00
  460. Hellsmouth Concept Animation - 2012
    winnie song 2014-02-25T04:47:00+00:00
  461. That time Oracle tried to have a professor fired for benchmarking their database
    Dan Luu 2014-03-05T00:00:00+00:00
  462. Too big to believe
    How to Spot a Psychopath 2014-03-08T00:06:00+00:00
  463. That bogus gender gap article
    Dan Luu 2014-03-09T00:00:00+00:00
  464. 7DRL Preparation
    Magma Fortress 2014-03-09T03:45:00+00:00
  465. 7DRL: Day 2
    Magma Fortress 2014-03-10T15:12:00+00:00
  466. 7DRL: Day 3
    Magma Fortress 2014-03-11T15:13:00+00:00
  467. The Computer Graphics Library
    Fabien Sanglard 2014-03-12T01:08:45+00:00
  468. 7DRL: Day 4
    Magma Fortress 2014-03-12T14:58:00+00:00
  469. 7DRL: Day 7 (Ragtag is a success)
    Magma Fortress 2014-03-15T12:55:00+00:00
  470. Editing binaries
    Dan Luu 2014-03-23T00:00:00+00:00
  471. 🕸️ My Journey Into Programming
    Nathan Youngman 2014-03-30T00:00:00+00:00
  472. Git Source Code Review
    Fabien Sanglard 2014-03-30T01:08:45+00:00
  473. Ate this two week old kale and beet salad with spicy peanut...
    food bores me 2014-04-01T03:09:21+00:00
  474. This weekend, I’m taking a couple days off from my book...
    Zac Gorman 2014-04-02T00:29:04+00:00
  475. Windows AC/Row/Infinite
    Simonschreibt. 2014-04-02T20:59:03+00:00
  476. Data-driven bug finding
    Dan Luu 2014-04-06T00:00:00+00:00
  477. Some GBC-style sprites I did for Frog Egg. (I reuploaded because...
    i make video games 2014-04-09T20:47:08+00:00
  478. My book is finally announced! Costume Quest: Invasion of the...
    Zac Gorman 2014-04-10T16:42:00+00:00
  479. happy egg friday
    i make video games 2014-04-18T17:50:14+00:00
  480. Whatever this frozen peas + eggplant + can of kidney beans...
    food bores me 2014-04-22T14:03:00+00:00
  481. necking
    jordan orelli 2014-04-25T19:10:13+00:00
  482. I was on Etsy’s Instagram feed the other day. Neat :)
    jordan orelli 2014-04-28T20:56:02+00:00
  483. Did some pixel art of various cat-based tiles for a weird game...
    i make video games 2014-04-29T06:41:58+00:00
  484. My Lunch Monstrosity, A Greatest Hits Album Featuring such...
    food bores me 2014-04-29T15:02:07+00:00
  485. I made some cats
    i make video games 2014-04-30T08:40:17+00:00
  486. As a follow up to my past posts, I’ve finally separated...
    i make video games 2014-05-01T08:56:00+00:00
  487. Your art is super awesome :)
    i make video games 2014-05-02T03:56:41+00:00
  488. Thanks to the Stanley Cup Playoffs, a very jam-packed work...
    food bores me 2014-05-06T03:39:00+00:00
  489. Listen, it was 8 in the morning, I was wildly hungover, the bus...
    food bores me 2014-05-12T14:40:43+00:00
  490. Shamus Keyboard Woes Explained
    int10h.org - VileR's blog 2014-05-13T10:43:38+00:00
  491. self portrait, cut paper
    jordan orelli 2014-05-14T12:22:42+00:00
  492. A 2-player one screen game in which the players take turns...
    winnie song 2014-05-15T09:15:00+00:00
  493. Welcome to the new ASCIImator!
    Posts on asie's blog 2014-05-15T23:00:00+00:00
  494. Arithmetic Games Set 1: a Peek into One of the First-Ever IBM PC Games
    int10h.org - VileR's blog 2014-05-15T23:54:46+00:00
  495. Imaginary Realities volume 6, issue 1
    Evennia Devblog RSS Feed 2014-05-16T00:00:00+00:00
  496. New movement options for Hoplite
    Magma Fortress 2014-05-16T13:33:00+00:00
  497. Newsletter #1 - A New Hope
    Neovim 2014-06-06T00:00:00+00:00
  498. Trespasser: Jurassic Park CG Source Code Review
    Fabien Sanglard 2014-06-10T01:08:45+00:00
  499. It’s a long way off still, but started mocking up a music...
    i make video games 2014-06-12T21:42:00+00:00
  500. Bringing back Python memory
    Evennia Devblog RSS Feed 2014-06-15T00:00:00+00:00
  501. Mer-Maid Manor (2P/Playbot) Can you clean the manor before the...
    Zac Gorman 2014-06-26T14:04:43+00:00
  502. Webby stuff
    Evennia Devblog RSS Feed 2014-06-30T00:00:00+00:00
  503. On the pulverisation of potatoes
    How to Spot a Psychopath 2014-07-02T05:37:34+00:00
  504. No time to make a meal between work shifts? Why not whip...
    food bores me 2014-07-02T17:02:04+00:00
  505. Newsletter #2 - Perchance to Dream
    Neovim 2014-07-04T00:00:00+00:00
  506. Frankenphone
    How to Spot a Psychopath 2014-07-06T02:32:28+00:00
  507. My Game Boy music maker, Bleep, has come a long way already! You...
    i make video games 2014-07-06T19:28:00+00:00
  508. I made this little jingle in Bleep. Not sure what it’s...
    i make video games 2014-07-08T01:40:38+00:00
  509. This is the battle screen for an RPG I was once making as a...
    i make video games 2014-07-18T02:27:36+00:00
  510. cutie witch girl (thing I never finished)
    i make video games 2014-07-18T03:56:29+00:00
  511. Hey, your bleep music creator sounds very awesome, I would definitely like to try it out some day. The main plus is that it looks way more simple than LSDJ, the only minus I see for now is short length of tracks (only about 3 minutes) - a lot of compositions are longer than that ;o I hope it will have a possibility to make longer tracks in the future :) Keep up the good work!
    i make video games 2014-07-18T04:26:06+00:00
  512. Some ink doodles I did with my Pentel brush pen a while back
    i make video games 2014-07-18T04:57:36+00:00
  513. This is a map that I was making for an exploration sidescroller...
    i make video games 2014-07-18T18:17:00+00:00
  514. Here’s a GIF full of spooky warped faces that move...
    i make video games 2014-07-18T19:24:00+00:00
  515. Whoops, my earlier GIF was fixed. Need a bunch of optimizations for...
    i make video games 2014-07-18T20:39:00+00:00
  516. Hoplite 2.3 progress
    Magma Fortress 2014-07-19T05:08:00+00:00
  517. Revenants, a Metroid-style sidescroller I was making at one...
    i make video games 2014-07-21T04:26:00+00:00
  518. Game Boy Wavy Scanline Effect #2 (better quality:...
    i make video games 2014-07-21T05:01:00+00:00
  519. Hoplite 2.3 progress II
    Magma Fortress 2014-07-25T13:12:00+00:00
  520. Leaderboards and balance changes for Hoplite
    Magma Fortress 2014-08-03T02:23:00+00:00
  521. Dance my puppets
    Evennia Devblog RSS Feed 2014-08-04T00:00:00+00:00
  522. A Grim Fandango poster that I worked to death (no pun...
    Zac Gorman 2014-08-04T14:51:48+00:00
  523. Bug fix release for Hoplite
    Magma Fortress 2014-08-06T09:54:00+00:00
  524. Game Engine Black Books
    Fabien Sanglard 2014-08-07T01:08:45+00:00
  525. Let's compile like it's 1992
    Fabien Sanglard 2014-08-10T01:08:45+00:00
  526. Google wage fixing, 11-CV-02509-LHK, ORDER DENYING PLAINTIFFS' MOTION FOR PRELIMINARY APPROVAL OF SETTLEMENTS WITH ADOBE, APPLE, GOOGLE, AND INTEL
    Dan Luu 2014-08-14T00:00:00+00:00
  527. Verilog Won & VHDL Lost? — You Be The Judge!
    Dan Luu 2014-08-14T00:00:00+00:00
  528. here’s the thing. I hate wasting food. if you and I are in...
    food bores me 2014-08-15T15:01:17+00:00
  529. The Road to Alpha, Week 25 - Imperfect Knowledge
    Citybound Devblog 2014-08-27T01:30:20+00:00
  530. Magma Music
    Magma Fortress 2014-08-27T14:30:00+00:00
  531. The Road to Alpha, Week 26 - Commute & Competition
    Citybound Devblog 2014-09-03T02:00:38+00:00
  532. Cards are the Future
    Magma Fortress 2014-09-04T12:57:00+00:00
  533. Newsletter #3 - Better Late than Never
    Neovim 2014-09-06T00:00:00+00:00
  534. Another bug fix release for Hoplite
    Magma Fortress 2014-09-06T03:28:00+00:00
  535. The Road to Alpha, Week 27 - Front Lawn Freeway
    Citybound Devblog 2014-09-09T22:18:34+00:00
  536. Plogue livenes
    Plogue R&D 2014-09-11T14:49:00+00:00
  537. STILL is a game about your hometown. [DL] Design, Visual | Made...
    winnie song 2014-09-13T05:09:00+00:00
  538. THE FOUR is a 1-player strategy game. You play as one of four...
    winnie song 2014-09-13T05:17:00+00:00
  539. Do Something is a 4-player local multiplayer game where you work...
    winnie song 2014-09-13T17:08:50+00:00
  540. WHISTLEBLOWER - An AGS game about whistleblowing. You play as a...
    winnie song 2014-09-13T17:08:51+00:00
  541. Parallel|Stitch is a game inspired by Sophie Houlden’s...
    winnie song 2014-09-13T17:08:53+00:00
  542. LBVQ is a game that teaches binary very quickly. | Made...
    winnie song 2014-09-13T17:08:55+00:00
  543. THIEF is a game where you are waiting for the bus with a...
    winnie song 2014-09-13T17:08:57+00:00
  544. First Impressions is a game about meeting the in-laws for the...
    winnie song 2014-09-13T17:09:03+00:00
  545. ASAP is a game about scheduling your boss’s life on your...
    winnie song 2014-09-13T17:09:05+00:00
  546. LIGHTRAFT is a game played with a MIDI controller and keen...
    winnie song 2014-09-13T17:16:00+00:00
  547. DRUNKWALK is a game about calling it a night – Let your heavy...
    winnie song 2014-09-13T17:17:00+00:00
  548. Bloodsport is a game about being in the woods with a beast and a...
    winnie song 2014-09-13T17:17:01+00:00
  549. BOSSA game about the interaction between a player and a hostile,...
    winnie song 2014-09-13T18:49:00+00:00
  550. onipress: FREE preview of idrawnintendo and doublefine’s...
    Zac Gorman 2014-09-15T19:05:43+00:00
  551. 14 rough ideas for SEGA t-shirts, done as an exercise. Maybe I...
    Zac Gorman 2014-09-16T04:03:59+00:00
  552. Another batch of SEGA t-shirt ideas. I had to complete the...
    Zac Gorman 2014-09-16T20:58:30+00:00
  553. The Road to Alpha, Week 28 - You Cut Me Off!
    Citybound Devblog 2014-09-17T02:47:34+00:00
  554. chipspeech Diary, Part 1
    Plogue R&D 2014-09-18T17:14:00+00:00
  555. I made this GIF of all the eversions from Eversion NES. (This is...
    i make video games 2014-09-19T08:20:00+00:00
  556. The Road to Alpha, Week 29 - Exciting Times
    Citybound Devblog 2014-09-23T23:54:51+00:00
  557. Slowly moving through town
    Evennia Devblog RSS Feed 2014-10-02T00:00:00+00:00
  558. The Road to Alpha, Week 30 - New Place!
    Citybound Devblog 2014-10-07T23:05:24+00:00
  559. The Road to Alpha, Week 31 - New Place, For Real!
    Citybound Devblog 2014-10-15T00:06:22+00:00
  560. Assembly v. intrinsics
    Dan Luu 2014-10-19T00:00:00+00:00
  561. Something New: Livestream Reviews
    Citybound Devblog 2014-10-20T11:17:11+00:00
  562. 10/22 Livestream Review
    Citybound Devblog 2014-10-23T18:15:44+00:00
  563. Mandelbrot: The Game
    Magma Fortress 2014-10-24T14:24:00+00:00
  564. Horse Simulator
    Magma Fortress 2014-10-25T06:13:00+00:00
  565. The Battlestar Encyclopedia
    Magma Fortress 2014-10-26T05:25:00+00:00
  566. Minehunter
    Magma Fortress 2014-10-27T13:21:00+00:00
  567. Conway's Game of Slime Creatures
    Magma Fortress 2014-10-28T09:03:00+00:00
  568. Still Life
    Magma Fortress 2014-10-29T04:24:00+00:00
  569. 10/31 Special Announcement
    Citybound Devblog 2014-10-31T12:13:43+00:00
  570. Caches: LRU v. random
    Dan Luu 2014-11-03T00:00:00+00:00
  571. Testing v. informal reasoning
    Dan Luu 2014-11-03T00:00:00+00:00
  572. CLWB and PCOMMIT
    Dan Luu 2014-11-05T00:00:00+00:00
  573. Newsletter #4 - Thanksvimming Day
    Neovim 2014-11-07T00:00:00+00:00
  574. Literature review on the benefits of static types
    Dan Luu 2014-11-07T00:00:00+00:00
  575. Prompt directory shortening
    Arabesque 2014-11-07T09:13:47+00:00
  576. Rust, Lifetimes, and Collections - Faultlore
    Faultlore 2014-11-09T00:00:00+00:00
  577. How often is the build broken?
    Dan Luu 2014-11-10T00:00:00+00:00
  578. The ol' Ball and Chain
    Magma Fortress 2014-11-10T12:30:00+00:00
  579. Speeding up this site by 50x
    Dan Luu 2014-11-17T00:00:00+00:00
  580. One week of bugs
    Dan Luu 2014-11-18T00:00:00+00:00
  581. The Road to Alpha, Week 36 - A Sign of Life
    Citybound Devblog 2014-11-18T14:13:04+00:00
  582. TF-IDF linux commits
    Dan Luu 2014-11-24T00:00:00+00:00
  583. Photo
    Zac Gorman 2014-11-25T00:35:11+00:00
  584. The Road to Alpha, Week 37 - Imaginary Progress
    Citybound Devblog 2014-11-25T22:37:58+00:00
  585. Zelda Wind Waker – Hyrule Travel Guide
    Simonschreibt. 2014-11-26T20:49:40+00:00
  586. Markets, discrimination, and "lowering the bar"
    Dan Luu 2014-12-01T00:00:00+00:00
  587. The Road to Alpha, Week 38 - Curve Control
    Citybound Devblog 2014-12-03T02:15:53+00:00
  588. Malloc tutorial
    Dan Luu 2014-12-04T00:00:00+00:00
  589. Forever Mining Print now available at Fangamer!
    nimasprout - Art by Nicole Gustafsson 2014-12-09T02:11:00+00:00
  590. Crafty Wonderland Colossal Holiday Show in Portland
    nimasprout - Art by Nicole Gustafsson 2014-12-11T16:30:00+00:00
  591. The Road to Alpha, Week 40 - Hyper-Active
    Citybound Devblog 2014-12-16T22:37:43+00:00
  592. Integer overflow checking cost
    Dan Luu 2014-12-17T00:00:00+00:00
  593. 2014 in Review
    Citybound Devblog 2014-12-25T17:37:31+00:00
  594. A review of the Julia language
    Dan Luu 2014-12-28T00:00:00+00:00
  595. chipspeech Diary, Part 2
    Plogue R&D 2014-12-29T22:47:00+00:00
  596. BADBLOOD BADBLOOD is a deadly game of hide & seek. It is a...
    winnie song 2014-12-31T17:25:00+00:00
  597. New Year - New Solo at Gallery 1988
    nimasprout - Art by Nicole Gustafsson 2015-01-02T16:33:00+00:00
  598. Rei Ayanami – Inner eyes
    Simonschreibt. 2015-01-07T12:35:29+00:00
  599. Developer Diary #1: Where do you think you're going?
    Citybound Devblog 2015-01-07T23:42:48+00:00
  600. Developer Diary #2: Intersection soup
    Citybound Devblog 2015-01-10T23:11:42+00:00
  601. What's new in CPUs since the 80s?
    Dan Luu 2015-01-11T00:00:00+00:00
  602. Cute Frog! A fun little visual novel mockup I started on. GBC...
    i make video games 2015-01-11T04:53:12+00:00
  603. Developer Diary #3: The Struggle
    Citybound Devblog 2015-01-12T15:13:57+00:00
  604. Pop Terrariums at Gallery 1988
    nimasprout - Art by Nicole Gustafsson 2015-01-14T20:17:00+00:00
  605. Developer Diary #4: Traffic Anarchy
    Citybound Devblog 2015-01-14T22:52:36+00:00
  606. A HashMap in Rust - What's a HashMap? - Faultlore
    Faultlore 2015-01-15T00:00:00+00:00
  607. Building Django proxies and MUD libraries
    Evennia Devblog RSS Feed 2015-01-19T00:00:00+00:00
  608. Blog monetization
    Dan Luu 2015-01-24T00:00:00+00:00
  609. Snoopy Valentine - Official Print Release with Dark Hall Mansion
    nimasprout - Art by Nicole Gustafsson 2015-01-28T18:08:00+00:00
  610. Shell config subfiles
    Arabesque 2015-01-29T11:01:09+00:00
  611. BuildCraft History and Design
    Posts on asie's blog 2015-01-29T23:00:00+00:00
  612. CPU backdoors
    Dan Luu 2015-02-03T00:00:00+00:00
  613. AI doesn't have to be very good to displace humans
    Dan Luu 2015-02-15T00:00:00+00:00
  614. Goodhearting IQ, cholesterol, and tail latency
    Dan Luu 2015-03-05T00:00:00+00:00
  615. Developer Diary #5: Back to Business
    Citybound Devblog 2015-03-05T22:29:50+00:00
  616. Challenge Mode
    Magma Fortress 2015-03-06T22:52:00+00:00
  617. What happens when you load a URL?
    Dan Luu 2015-03-07T00:00:00+00:00
  618. Challenge Mode Progress
    Magma Fortress 2015-03-09T11:21:00+00:00
  619. Given that we spend little effort on testing, how should we test software?
    Dan Luu 2015-03-10T00:00:00+00:00
  620. Hoplite Challenge Mode is ready
    Magma Fortress 2015-03-15T09:32:00+00:00
  621. Postcard Correspondence opens at Gallery 1988 tonight!
    nimasprout - Art by Nicole Gustafsson 2015-03-20T15:54:00+00:00
  622. Developer Diary #6: Zoning, Struggling, Parceling
    Citybound Devblog 2015-03-21T22:17:08+00:00
  623. Reading citations is easier than most people think
    Dan Luu 2015-03-29T00:00:00+00:00
  624. New Prints for Spring/Summer.
    nimasprout - Art by Nicole Gustafsson 2015-03-31T21:35:00+00:00
  625. Developer Diary #7 - The Economic Model
    Citybound Devblog 2015-04-01T14:11:52+00:00
  626. Newsletter #5 - Out of the Box
    Neovim 2015-04-03T00:00:00+00:00
  627. CGA in 1024 Colors - a New Mode: the Illustrated Guide
    int10h.org - VileR's blog 2015-04-15T20:56:27+00:00
  628. Mouse Guard: Legends of the Guard Vol 3, #2
    nimasprout - Art by Nicole Gustafsson 2015-04-19T17:56:00+00:00
  629. Pre-Pooping Your Pants With Rust - Faultlore
    Faultlore 2015-04-27T00:00:00+00:00
  630. DevDiary #8 - Technical Background Work
    Citybound Devblog 2015-04-27T16:17:39+00:00
  631. Photo
    ♘ 2015-04-29T09:30:09+00:00
  632. Photo
    ♘ 2015-04-30T09:30:14+00:00
  633. Photo
    ♘ 2015-05-03T12:52:08+00:00
  634. We used to build steel mills near cheap power. Now that's where we build datacenters
    Dan Luu 2015-05-04T00:00:00+00:00
  635. Crafty Wonderland Colossal Spring Sale this May 9th
    nimasprout - Art by Nicole Gustafsson 2015-05-07T03:55:00+00:00
  636. Documenting Python without Sphinx
    Evennia Devblog RSS Feed 2015-05-09T00:00:00+00:00
  637. Things goin on
    Evennia Devblog RSS Feed 2015-05-11T00:00:00+00:00
  638. Crafty Wonderland Recap
    nimasprout - Art by Nicole Gustafsson 2015-05-12T20:39:00+00:00
  639. Haunted Depths - New Print at Tiny Showcase
    nimasprout - Art by Nicole Gustafsson 2015-05-13T16:46:00+00:00
  640. Photo
    ♘ 2015-05-16T16:32:28+00:00
  641. Advantages of monorepos
    Dan Luu 2015-05-17T00:00:00+00:00
  642. Challenge Mode Comes to Android and iOS
    Magma Fortress 2015-05-18T10:11:00+00:00
  643. A defense of boring languages
    Dan Luu 2015-05-25T00:00:00+00:00
  644. The googlebot monopoly
    Dan Luu 2015-05-27T00:00:00+00:00
  645. Dreaming big?
    Evennia Devblog RSS Feed 2015-05-30T00:00:00+00:00
  646. Slashdot and Sourceforge
    Dan Luu 2015-05-31T00:00:00+00:00
  647. Rust, Generics, and Collections - Faultlore
    Faultlore 2015-06-03T00:00:00+00:00
  648. Rust Collections Case Study: BTreeMap - Faultlore
    Faultlore 2015-06-05T00:00:00+00:00
  649. Photo
    ♘ 2015-06-08T15:05:11+00:00
  650. June 2015 Update (Mystery Feature)
    Citybound Devblog 2015-06-09T13:27:12+00:00
  651. Photo
    ♘ 2015-06-10T10:45:03+00:00
  652. Fantastical Flora and Fauna exhibit at Gallery Nucleus
    nimasprout - Art by Nicole Gustafsson 2015-06-12T00:16:00+00:00
  653. Need your help!
    Evennia Devblog RSS Feed 2015-06-15T00:00:00+00:00
  654. Artwork from Fantastical Fauna and Flora at Gallery Nucleus
    nimasprout - Art by Nicole Gustafsson 2015-06-15T15:35:00+00:00
  655. The Road to Alpha, Week 66 - More on Planning Mode
    Citybound Devblog 2015-06-20T01:42:35+00:00
  656. Announcing the Evennia example-game project "Ainneve"
    Evennia Devblog RSS Feed 2015-06-22T00:00:00+00:00
  657. Recent Shows at iam8bit Gallery in LA
    nimasprout - Art by Nicole Gustafsson 2015-06-24T17:05:00+00:00
  658. Out and about at Mt. Rainier National Forest
    nimasprout - Art by Nicole Gustafsson 2015-07-06T15:20:00+00:00
  659. Discrete Arctan in 6502
    dustmop.io blog 2015-07-22T15:18:43+00:00
  660. Sacred 2 – Fake Mirror
    Simonschreibt. 2015-07-23T00:11:59+00:00
  661. Bag Review: National Geographic A2540
    Steve Losh 2015-07-24T18:42:00+00:00
  662. Bag Review: National Geographic MC5350
    Steve Losh 2015-07-26T13:35:00+00:00
  663. Photo
    ♘ 2015-07-30T15:10:50+00:00
  664. Batsly Adams – Star Versus Production
    dustmop.io blog 2015-07-31T18:21:23+00:00
  665. New Design
    int10h.org - VileR's blog 2015-08-03T05:14:28+00:00
  666. Common ain't no language I ever heard of!
    Smerg Development Journal 2015-08-04T17:32:32+00:00
  667. This modern world
    Smerg Development Journal 2015-08-05T16:47:47+00:00
  668. 8088 MPH Final: Old vs. New CGA (and Other Gory Details)
    int10h.org - VileR's blog 2015-08-07T00:39:06+00:00
  669. Resources, Part II
    Smerg Development Journal 2015-08-08T16:38:21+00:00
  670. August 2015 Update - A week with Michael
    Citybound Devblog 2015-08-09T10:25:14+00:00
  671. 101 Monochrome Mazes: Why Not Color?
    int10h.org - VileR's blog 2015-08-09T13:31:32+00:00
  672. untitled
    Smerg Development Journal 2015-08-14T17:15:34+00:00
  673. Skilling it up
    Smerg Development Journal 2015-08-15T07:42:11+00:00
  674. Render Hell – Book V
    Simonschreibt. 2015-08-16T17:00:36+00:00
  675. Render Hell – Book IV
    Simonschreibt. 2015-08-16T17:01:25+00:00
  676. Render Hell – Book III
    Simonschreibt. 2015-08-16T17:02:55+00:00
  677. Render Hell – Book II
    Simonschreibt. 2015-08-16T17:03:20+00:00
  678. Render Hell – Book I
    Simonschreibt. 2015-08-16T17:04:42+00:00
  679. Render Hell 2.0
    Simonschreibt. 2015-08-16T17:05:21+00:00
  680. Ghost in the Finite State Machine
    Smerg Development Journal 2015-08-16T17:13:36+00:00
  681. Photo
    ♘ 2015-08-17T17:15:27+00:00
  682. Blocking blocks block path! We go NOWHERE!
    Smerg Development Journal 2015-08-19T19:05:26+00:00
  683. Reading postmortems
    Dan Luu 2015-08-20T00:00:00+00:00
  684. Photo
    ♘ 2015-08-25T09:38:37+00:00
  685. Photo
    ♘ 2015-08-25T09:38:47+00:00
  686. A wagon load of post summer updates
    Evennia Devblog RSS Feed 2015-08-27T00:00:00+00:00
  687. Accounting Department
    Smerg Development Journal 2015-08-27T21:10:57+00:00
  688. Steve Yegge's prediction record
    Dan Luu 2015-08-31T00:00:00+00:00
  689. atonal 2015
    @mntmn 2015-09-05T13:50:05+00:00
  690. atonal 2015
    @mntmn 2015-09-05T13:52:07+00:00
  691. atonal 2015
    @mntmn 2015-09-05T13:53:27+00:00
  692. heart and penis sprites we made on commodore 128 in BASIC
    @mntmn 2015-09-05T13:55:10+00:00
  693. some breakbeats in ohm that i liked
    @mntmn 2015-09-05T13:56:30+00:00
  694. uridium 2 intro on amiga 1200
    @mntmn 2015-09-05T13:58:15+00:00
  695. meganalicerose: When your shoes match your leggings 💁...
    @mntmn 2015-09-07T07:46:56+00:00
  696. Interim OS running on Interim computer prototype. (Details at...
    @mntmn 2015-09-09T13:53:40+00:00
  697. Hold the RESET button while turning the power off
    Smerg Development Journal 2015-09-09T18:16:38+00:00
  698. untitled
    Smerg Development Journal 2015-09-09T19:19:58+00:00
  699. Re-Re-Revisiting Skills
    Smerg Development Journal 2015-09-13T19:47:50+00:00
  700. Lightbulb over head
    Smerg Development Journal 2015-09-14T18:40:37+00:00
  701. Out and about at Cannon Beach, Oregon
    nimasprout - Art by Nicole Gustafsson 2015-09-15T19:46:00+00:00
  702. Photo
    @mntmn 2015-09-15T22:19:29+00:00
  703. More ideas regarding Exits
    Smerg Development Journal 2015-09-16T07:15:30+00:00
  704. Photo
    ♘ 2015-09-17T18:24:09+00:00
  705. Changed changes of changing
    Smerg Development Journal 2015-09-18T18:17:02+00:00
  706. One step back, two steps forward
    Smerg Development Journal 2015-09-21T17:55:44+00:00
  707. New Shop!
    nimasprout - Art by Nicole Gustafsson 2015-09-21T18:40:00+00:00
  708. The Road to Alpha, Week 89 - Theory and Practice
    Citybound Devblog 2015-09-23T03:40:24+00:00
  709. ALL the resources!
    Smerg Development Journal 2015-09-23T04:13:25+00:00
  710. Pushing through a straw
    Evennia Devblog RSS Feed 2015-09-24T00:00:00+00:00
  711. Photo
    Infraspace 2015-09-26T06:10:32+00:00
  712. Photo
    Infraspace 2015-09-26T06:18:44+00:00
  713. Oh, right, that.
    Smerg Development Journal 2015-09-26T06:28:54+00:00
  714. Enter the new Exits
    Smerg Development Journal 2015-09-28T19:36:46+00:00
  715. Evennia on `podcast.__init__`
    Evennia Devblog RSS Feed 2015-09-29T00:00:00+00:00
  716. Slowlock
    Dan Luu 2015-09-30T00:00:00+00:00
  717. Video
    @mntmn 2015-09-30T09:10:09+00:00
  718. Trust your technolust. (Get a Cyberdelia sticker to go with your...
    Cyberdelia NYC 2015-10-01T02:15:21+00:00
  719. All the shows I forgot to post.
    nimasprout - Art by Nicole Gustafsson 2015-10-01T16:25:00+00:00
  720. Emoting System
    Evennia Devblog RSS Feed 2015-10-02T00:00:00+00:00
  721. Another quick fun idea
    Smerg Development Journal 2015-10-03T05:01:48+00:00
  722. Why Intel added cache partitioning
    Dan Luu 2015-10-04T00:00:00+00:00
  723. Watchdog – Problems
    Simonschreibt. 2015-10-04T14:01:43+00:00
  724. Watchdog – Gallery
    Simonschreibt. 2015-10-04T14:02:24+00:00
  725. Watchdog – Mail
    Simonschreibt. 2015-10-04T14:03:27+00:00
  726. Watchdog – Compare
    Simonschreibt. 2015-10-04T14:04:07+00:00
  727. Watchdog – Convert
    Simonschreibt. 2015-10-04T14:06:58+00:00
  728. Watchdog – Take Screenshots
    Simonschreibt. 2015-10-04T14:07:15+00:00
  729. Watchdog – Prepare your Game
    Simonschreibt. 2015-10-04T14:08:10+00:00
  730. Watchdog – Structure
    Simonschreibt. 2015-10-04T14:09:21+00:00
  731. Watchdog Script
    Simonschreibt. 2015-10-04T14:10:01+00:00
  732. Diablo Gate
    Simonschreibt. 2015-10-08T09:27:26+00:00
  733. “Never send a boy to do a woman’s job.” –Acid Burn Nice initial...
    Cyberdelia NYC 2015-10-08T20:08:57+00:00
  734. more dogs have been to space than people who genuinely love you
    @mntmn 2015-10-09T09:43:26+00:00
  735. Illustrations and soaps
    Evennia Devblog RSS Feed 2015-10-11T00:00:00+00:00
  736. How do computers have a sense of time?
    @mntmn 2015-10-11T10:57:27+00:00
  737. inblack-wetrust: Undercover ss2014
    @mntmn 2015-10-12T14:57:00+00:00
  738. Open Assets via Text
    Simonschreibt. 2015-10-12T21:39:45+00:00
  739. Photo
    @mntmn 2015-10-12T23:39:50+00:00
  740. Halloween Prints and more!
    nimasprout - Art by Nicole Gustafsson 2015-10-13T17:59:00+00:00
  741. Meanwhile in another dimension...
    Smerg Development Journal 2015-10-14T05:55:48+00:00
  742. It's Aliiiiiive...
    daftmike's blog 2015-10-16T07:15:00+00:00
  743. Pumpkin Grove Print Set
    nimasprout - Art by Nicole Gustafsson 2015-10-16T16:28:00+00:00
  744. Teen Who Hacked CIA Director’s Email Tells How He Did...
    Cyberdelia NYC 2015-10-21T17:05:07+00:00
  745. X:Rebirth – Geometric Lensflares
    Simonschreibt. 2015-10-23T19:52:16+00:00
  746. eightninea: Moogfest —
    @mntmn 2015-10-25T13:50:10+00:00
  747. CTC Bizer Duplicator... My new 3D printer
    daftmike's blog 2015-10-27T11:40:00+00:00
  748. Photo
    ♘ 2015-10-27T12:39:03+00:00
  749. there is no problem for which X11 forwarding is the correct solution
    @mntmn 2015-10-28T00:30:02+00:00
  750. this computer has an identity crisis
    @mntmn 2015-10-29T23:04:18+00:00
  751. Infinite disk
    Dan Luu 2015-11-01T00:00:00+00:00
  752. Photo
    @mntmn 2015-11-01T21:48:10+00:00
  753. Little Blue Box - Jobs And Wozniak on Phone Phreaking “Before...
    Cyberdelia NYC 2015-11-05T17:46:11+00:00
  754. Mystery Toronto Artist Gives Payphones a Makeover including a...
    Cyberdelia NYC 2015-11-06T21:09:17+00:00
  755. Getting Optimal Apple ][ Screenshots w/NTSC Emulation
    int10h.org - VileR's blog 2015-11-08T14:04:54+00:00
  756. MIT uses Evennia!
    Evennia Devblog RSS Feed 2015-11-12T00:00:00+00:00
  757. The Road to Alpha, Week 96 - Committing to ...
    Citybound Devblog 2015-11-12T01:12:21+00:00
  758. Fallout 4 – Wasteland Eyes
    Simonschreibt. 2015-11-17T13:52:45+00:00
  759. Happy Little Words
    Steve Losh 2015-11-20T18:43:00+00:00
  760. What's worked in Computer Science: 1999 v. 2015
    Dan Luu 2015-11-23T00:00:00+00:00
  761. Photo
    ♘ 2015-11-23T21:07:53+00:00
  762. What It Was Like When They Filmed Hackers At My High School
    Cyberdelia NYC 2015-11-24T13:41:49+00:00
  763. Why use ECC?
    Dan Luu 2015-11-27T00:00:00+00:00
  764. Trying out Beam.pro
    Citybound Devblog 2015-11-27T01:21:14+00:00
  765. Okay. Let’s go shopping. DADE I’ll hack the...
    Cyberdelia NYC 2015-11-27T15:56:48+00:00
  766. Photo
    ♘ 2015-11-27T17:07:05+00:00
  767. Michael left Citybound
    Citybound Devblog 2015-11-27T22:25:08+00:00
  768. system
    @mntmn 2015-11-28T14:50:11+00:00
  769. Just Beat the Data Out of It
    Steve Losh 2015-11-30T16:10:00+00:00
  770. Spotted in the wild. “Okay. Let’s go...
    Cyberdelia NYC 2015-11-30T18:30:42+00:00
  771. Hackers Oral History: How Did This Get Made
    Cyberdelia NYC 2015-11-30T21:00:36+00:00
  772. My first 3D design...
    daftmike's blog 2015-12-01T08:36:00+00:00
  773. Famed for Tango and Hackers
    Cyberdelia NYC 2015-12-01T14:02:03+00:00
  774. Photo
    ♘ 2015-12-01T19:05:49+00:00
  775. Photo
    ♘ 2015-12-01T19:07:05+00:00
  776. What is Color Banding? And what is it not?
    Simonschreibt. 2015-12-02T19:00:20+00:00
  777. Photo
    ♘ 2015-12-04T14:49:44+00:00
  778. Braid – Respect the Rules
    Simonschreibt. 2015-12-07T11:24:40+00:00
  779. The winding, telephonic odyssey of Joybubbles, the original phone phreak
    Cyberdelia NYC 2015-12-07T19:24:39+00:00
  780. My BLT drive on my computer just went...
    Cyberdelia NYC 2015-12-08T13:22:16+00:00
  781. Newsletter #6 - Ship it!
    Neovim 2015-12-09T00:00:00+00:00
  782. What the Hell are Permutation Patterns?
    Steve Losh 2015-12-10T19:55:00+00:00
  783. I'm participating in a game jam this weekend
    Citybound Devblog 2015-12-10T23:22:01+00:00
  784. Files are hard
    Dan Luu 2015-12-12T00:00:00+00:00
  785. Ludum Dare 34 Postmortem
    Steve Losh 2015-12-15T16:30:00+00:00
  786. A summary of a year
    Evennia Devblog RSS Feed 2015-12-17T00:00:00+00:00
  787. Big companies v. startups
    Dan Luu 2015-12-17T00:00:00+00:00
  788. BigBlue Terminal: An Oldschool Fixed-Width Pixel Font
    int10h.org - VileR's blog 2015-12-18T16:51:40+00:00
  789. NES Graphics – Part 3
    dustmop.io blog 2015-12-18T18:00:31+00:00
  790. What RESTful actually means
    Code Words 2015-12-19T09:00:00+00:00
  791. Fallout 4 – The Mushroom Case
    Simonschreibt. 2015-12-23T01:46:27+00:00
  792. How to trick a neural network into thinking a panda is a vulture
    Code Words 2015-12-23T09:00:00+00:00
  793. cyberdelianyc: What, your mom buy you a ‘Puter for Christmas?...
    Cyberdelia NYC 2015-12-25T13:01:38+00:00
  794. Normalization of deviance
    Dan Luu 2015-12-29T00:00:00+00:00
  795. New Year prediction… (for 1996?) Kate: RISC architecture is...
    Cyberdelia NYC 2016-01-04T20:34:38+00:00
  796. Solo show premiering at Gallery 1988 (East)
    nimasprout - Art by Nicole Gustafsson 2016-01-05T05:12:00+00:00
  797. LinkNYC public Wi-Fi Finally Getting Installed New York is...
    Cyberdelia NYC 2016-01-05T13:57:18+00:00
  798. Delayed Reference Method
    Simonschreibt. 2016-01-05T16:47:05+00:00
  799. A Promising 2016
    Citybound Devblog 2016-01-06T23:35:09+00:00
  800. Windows (and ClearType) vs. Truetype Fonts with Embedded Bitmaps
    int10h.org - VileR's blog 2016-01-07T21:11:23+00:00
  801. I make a Craft(Friends) check...
    Smerg Development Journal 2016-01-08T17:02:11+00:00
  802. We saw some really bad Intel CPU bugs in 2015 and we should expect to see more in the future
    Dan Luu 2016-01-10T00:00:00+00:00
  803. Photo
    Cyberdelia NYC 2016-01-11T01:58:45+00:00
  804. Banned of Brothers
    Smerg Development Journal 2016-01-12T17:11:52+00:00
  805. Diablo 3 – The sacred spiderweb
    Simonschreibt. 2016-01-14T18:46:35+00:00
  806. The Once and Future Weird Kids at Gallery 1988
    nimasprout - Art by Nicole Gustafsson 2016-01-16T20:28:00+00:00
  807. The Ultimate Oldschool PC Font Pack (v1.0)
    int10h.org - VileR's blog 2016-01-16T22:07:06+00:00
  808. Experiments with toner transfer...
    daftmike's blog 2016-01-19T14:00:00+00:00
  809. Alpha 1 – My Top 5 Usecases
    Simonschreibt. 2016-01-22T15:31:12+00:00
  810. Sampling v. tracing
    Dan Luu 2016-01-24T00:00:00+00:00
  811. Niagara calls
    How to Spot a Psychopath 2016-01-30T05:38:21+00:00
  812. Bill Gates Hacked His High School’s Computers to Be Placed in...
    Cyberdelia NYC 2016-02-01T21:02:57+00:00
  813. After just two years, I'm starting properly!
    Citybound Devblog 2016-02-01T22:03:26+00:00
  814. Blubb! – Fish Tanks in Games
    Simonschreibt. 2016-02-02T21:09:12+00:00
  815. Scanning for confidential information on external web servers
    The Grymoire 2016-02-06T16:50:53+00:00
  816. Diablo 3 – Wings of Angels
    Simonschreibt. 2016-02-11T13:44:24+00:00
  817. Photo
    ♘ 2016-02-11T17:57:59+00:00
  818. Climbing up Branches
    Evennia Devblog RSS Feed 2016-02-14T00:00:00+00:00
  819. The long path of player generation
    Smerg Development Journal 2016-02-15T09:22:47+00:00
  820. A monumental day
    Smerg Development Journal 2016-02-19T02:10:36+00:00
  821. Terrain Generation with Midpoint Displacement
    Steve Losh 2016-02-19T19:45:00+00:00
  822. Dark Maus – Top Down Trees
    Simonschreibt. 2016-02-24T16:06:04+00:00
  823. So-called "IBM" Freeware Games from the Early '80s
    int10h.org - VileR's blog 2016-02-26T09:02:11+00:00
  824. Harry Potter and the Methods of Rationality review by su3su2u1
    Dan Luu 2016-03-01T00:00:00+00:00
  825. su3su2u1 physics tumblr archive
    Dan Luu 2016-03-01T00:00:00+00:00
  826. v1.2! never stop! #amiga
    @mntmn 2016-03-01T18:55:01+00:00
  827. Alien vs Wolfenstein – Cutting Torch
    Simonschreibt. 2016-03-02T20:01:25+00:00
  828. This just isn't functional
    Code Words 2016-03-07T12:00:00+00:00
  829. Recursive Midpoint Displacement
    Steve Losh 2016-03-07T13:45:00+00:00
  830. Image Processing 101
    Code Words 2016-03-10T09:00:00+00:00
  831. Lotus Text
    dustmop.io blog 2016-03-10T17:43:10+00:00
  832. Telling stories with data using the grammar of graphics
    Code Words 2016-03-16T10:00:00+00:00
  833. A Music Update From Dane
    Citybound Devblog 2016-03-18T13:26:04+00:00
  834. We only hire the trendiest
    Dan Luu 2016-03-21T07:23:44+00:00
  835. Olympiad: IBM Prototype Fonts Unearthed
    int10h.org - VileR's blog 2016-03-22T22:36:46+00:00
  836. Technical stuff happening
    Evennia Devblog RSS Feed 2016-03-24T00:00:00+00:00
  837. Immutability is not enough
    Code Words 2016-03-29T10:00:00+00:00
  838. Thermoelectric Drinks-Can Cooler
    daftmike's blog 2016-03-30T18:22:00+00:00
  839. Now in German: Eine kleine Statusberichterstattung
    Citybound Devblog 2016-04-01T22:06:02+00:00
  840. April Fools!
    Citybound Devblog 2016-04-02T20:56:50+00:00
  841. Google SRE book
    Dan Luu 2016-04-11T08:00:58+00:00
  842. How I'm getting along
    Citybound Devblog 2016-04-18T00:17:49+00:00
  843. Some programming blogs to consider reading
    Dan Luu 2016-04-18T07:06:34+00:00
  844. The Secrets of Medieval Fonts
    medievalbooks 2016-04-29T10:48:27+00:00
  845. "Celestial Spaces" opening at Flatcolor Gallery
    nimasprout - Art by Nicole Gustafsson 2016-05-02T16:08:00+00:00
  846. Cron best practices
    Arabesque 2016-05-08T05:19:19+00:00
  847. Evennia 0.6!
    Evennia Devblog RSS Feed 2016-05-22T00:00:00+00:00
  848. Dopefish goes NTSC: Commander Keen 4 Composite CGA Patch Notes
    int10h.org - VileR's blog 2016-05-28T23:40:51+00:00
  849. Background: A Tale of Two Worlds
    Citybound Devblog 2016-05-29T19:11:38+00:00
  850. Evennia in Pictures
    Evennia Devblog RSS Feed 2016-05-31T00:00:00+00:00
  851. The Wit.nes (demo)
    dustmop.io blog 2016-06-03T17:01:56+00:00
  852. Shifts in the blogging tide
    Article on Coyote Cartography 2016-06-19T00:41:39+00:00
  853. Terrain Generation with Diamond Square
    Steve Losh 2016-06-27T13:35:00+00:00
  854. What the Hell is Symbolic Computation?
    Steve Losh 2016-06-29T13:30:00+00:00
  855. The art of sharing nicks and descriptions
    Evennia Devblog RSS Feed 2016-07-01T00:00:00+00:00
  856. The Joy of VFX – Pintable
    Simonschreibt. 2016-07-10T18:47:19+00:00
  857. Background: An Architecture for Millions of Things
    Citybound Devblog 2016-07-13T20:37:25+00:00
  858. Yet another 16-color CGA makeover: Keen 5
    int10h.org - VileR's blog 2016-07-17T23:08:51+00:00
  859. Keen 4 Mystery Code Demystified
    int10h.org - VileR's blog 2016-07-17T23:32:37+00:00
  860. Adventures in SSL
    Smerg Development Journal 2016-07-23T21:53:33+00:00
  861. NESPi - my Mini NES Classic Raspberry Pi games console
    daftmike's blog 2016-07-27T20:00:00+00:00
  862. Mini NES Classic Updates
    daftmike's blog 2016-08-01T15:42:00+00:00
  863. Photo
    ♘ 2016-08-04T08:19:59+00:00
  864. Slides: Demystifying Demakes
    dustmop.io blog 2016-08-04T19:48:37+00:00
  865. Notes on concurrency bugs
    Dan Luu 2016-08-05T03:32:26+00:00
  866. August 2016 Lisp Game Jam Postmortem
    Steve Losh 2016-08-15T13:45:00+00:00
  867. Happy 35th birthday, IBM PC!
    int10h.org - VileR's blog 2016-08-18T13:28:29+00:00
  868. Look at a computer chip up close and it almost looks like an...
    Cyberdelia NYC 2016-08-18T20:04:48+00:00
  869. Playing With Syntax
    Steve Losh 2016-08-19T13:15:00+00:00
  870. The Elegance of Deflate
    codersnotes.com 2016-08-21T07:00:00+00:00
  871. The Multi-Project Programmer
    codersnotes.com 2016-08-26T07:00:00+00:00
  872. The Metaprogrammer
    codersnotes.com 2016-09-06T07:00:00+00:00
  873. Learning To Wrangle Half-Floats
    codersnotes.com 2016-09-10T07:00:00+00:00
  874. How I learned to program
    Dan Luu 2016-09-12T08:41:26+00:00
  875. Debunking Euclideon's Unlimited Detail Tech
    codersnotes.com 2016-09-13T07:00:00+00:00
  876. Celebrating 21 years
    Cyberdelia NYC 2016-09-15T15:13:14+00:00
  877. the 7th hacker
    Cyberdelia NYC 2016-09-15T15:16:04+00:00
  878. Customizing Common Lisp's Iterate: Averaging
    Steve Losh 2016-09-20T13:45:00+00:00
  879. Weekly Programming Challenge #9
    The Buckblog 2016-09-24T06:00:00+00:00
  880. The last weeks were mostly spent with improving the tools and...
    DeathTrash 2016-09-24T11:17:52+00:00
  881. Is dev compensation bimodal?
    Dan Luu 2016-09-27T06:33:26+00:00
  882. More characters. (Little diversion from all that tools...
    DeathTrash 2016-09-29T12:26:08+00:00
  883. Weekly Programming Challenge #10
    The Buckblog 2016-10-01T06:00:00+00:00
  884. Untonemapping, and other stupid tricks
    codersnotes.com 2016-10-02T07:00:00+00:00
  885. I could do that in a weekend!
    Dan Luu 2016-10-03T08:14:27+00:00
  886. The Challenge Of Making Things
    codersnotes.com 2016-10-04T07:00:00+00:00
  887. Weekly Programming Challenge #11
    The Buckblog 2016-10-08T06:00:00+00:00
  888. All that work on the Level Editor is finally paying off and...
    DeathTrash 2016-10-09T07:04:35+00:00
  889. Hiring and the market for lemons
    Dan Luu 2016-10-09T09:44:14+00:00
  890. Customizing Common Lisp's Iterate: Timing
    Steve Losh 2016-10-10T14:50:00+00:00
  891. Data driven literary analysis
    Code Words 2016-10-11T12:00:00+00:00
  892. A tour of random forests
    Code Words 2016-10-11T12:00:00+00:00
  893. A history of storage media
    Code Words 2016-10-11T12:00:00+00:00
  894. Season of fixes
    Evennia Devblog RSS Feed 2016-10-13T00:00:00+00:00
  895. Weekly Programming Challenge #12
    The Buckblog 2016-10-15T06:00:00+00:00
  896. The Illusion Of Controls
    codersnotes.com 2016-10-15T07:00:00+00:00
  897. Programming book recommendations and anti-recommendations
    Dan Luu 2016-10-16T08:06:34+00:00
  898. “How’s it hanging?” Animations halfway done on this one. Need...
    DeathTrash 2016-10-17T17:00:54+00:00
  899. New Paintings Featuring Spell Cats
    nimasprout - Art by Nicole Gustafsson 2016-10-18T17:19:00+00:00
  900. UI improvements.
    DeathTrash 2016-10-19T10:00:55+00:00
  901. Custom commands
    Arabesque 2016-10-22T05:37:44+00:00
  902. Weekly Programming Challenge #13
    The Buckblog 2016-10-22T06:00:00+00:00
  903. HN: the good parts
    Dan Luu 2016-10-23T00:00:00+00:00
  904. Weekly Programming Challenge #14
    The Buckblog 2016-10-29T06:00:00+00:00
  905. Newsletter #7 - Summer of Road
    Neovim 2016-11-01T00:00:00+00:00
  906. Stormy Weather Arts Festival at Cannon Beach, Oregon.
    nimasprout - Art by Nicole Gustafsson 2016-11-04T14:00:00+00:00
  907. Weekly Programming Challenge #15
    The Buckblog 2016-11-05T06:00:00+00:00
  908. Working on combat this week and it’s beginning to feel a lot...
    DeathTrash 2016-11-05T12:35:00+00:00
  909. Photo
    ♘ 2016-11-08T13:19:31+00:00
  910. Weekly Programming Challenge #16
    The Buckblog 2016-11-12T07:00:00+00:00
  911. "There is another challenge we must address – and it is the corrupting force of the vast sums of..."
    LESSIG Blog 2016-11-13T06:24:47+00:00
  912. Help me express the relative presidential voting power?
    LESSIG Blog 2016-11-13T07:08:57+00:00
  913. Worked on a lot of gameplay related things in the last week. For...
    DeathTrash 2016-11-13T17:32:10+00:00
  914. https://soundcloud.com/lessig/epstein-and-lessig-on-public-fundin...
    LESSIG Blog 2016-11-14T06:18:49+00:00
  915. One person, one vote? yea, right. The corruption that is the Electoral College
    LESSIG Blog 2016-11-17T06:00:32+00:00
  916. Is there a “who’s here” app?
    LESSIG Blog 2016-11-17T09:21:37+00:00
  917. Worked mostly on the framework this week. It’s now at a...
    DeathTrash 2016-11-18T16:51:46+00:00
  918. Weekly Programming Challenges -- Recap
    The Buckblog 2016-11-19T07:00:00+00:00
  919. Beating The Compiler
    codersnotes.com 2016-11-28T08:00:00+00:00
  920. Birthday retrospective
    Evennia Devblog RSS Feed 2016-11-30T00:00:00+00:00
  921. People, not acres, should count in a democracy. (And please...
    LESSIG Blog 2016-11-30T09:03:18+00:00
  922. Birthday retrospective
    Griatch's Evennia musings 2016-11-30T14:38:00+00:00
  923. Sneaking around.
    DeathTrash 2016-12-02T12:24:51+00:00
  924. Learning Via Bullshit
    codersnotes.com 2016-12-06T08:00:00+00:00
  925. Bubbles, Baseball, and Mr. Marsh
    Article on Coyote Cartography 2016-12-07T20:15:48+00:00
  926. So I’ve had my first “zero-carbon-footprint-you” threat
    LESSIG Blog 2016-12-10T02:35:08+00:00
  927. Assassin’s Creed: Black Flag – Waterplane
    Simonschreibt. 2016-12-14T08:55:42+00:00
  928. Form over frolic: Jony Ive’s quest for boring perfection
    Article on Coyote Cartography 2016-12-16T18:30:52+00:00
  929. Converging Towards Disneyland
    codersnotes.com 2016-12-19T08:00:00+00:00
  930. CHIP-8 in Common Lisp: The CPU
    Steve Losh 2016-12-19T17:45:00+00:00
  931. Point of view.
    DeathTrash 2016-12-20T15:01:10+00:00
  932. CHIP-8 in Common Lisp: Graphics
    Steve Losh 2016-12-21T16:55:00+00:00
  933. Process Roulette
    The Buckblog 2016-12-23T07:00:00+00:00
  934. CHIP-8 in Common Lisp: Input
    Steve Losh 2016-12-23T16:00:00+00:00
  935. Christmas 2016 Announcement
    Citybound Devblog 2016-12-25T01:54:31+00:00
  936. CHIP-8 in Common Lisp: Sound
    Steve Losh 2016-12-26T17:30:00+00:00
  937. It was cast in stone and iron so that it could not further...
    DeathTrash 2016-12-27T21:57:11+00:00
  938. Game #63: Dragon Slayer: The Legend of Heroes (TurboGrafx-CD) - It's In Your Hands Now (Finished)
    The RPG Consoler 2016-12-30T00:06:00+00:00
  939. Game #64: Exile (TurboGrafx-CD) - Another Arabian Night (Finished)
    The RPG Consoler 2017-01-01T02:34:00+00:00
  940. Below the Cut: Spiritual Warfare (NES, Genesis, Game Boy)
    The RPG Consoler 2017-01-02T08:08:00+00:00
  941. CHIP-8 in Common Lisp: Disassembly
    Steve Losh 2017-01-02T17:15:00+00:00
  942. Game #65: Soul Blazer (SNES) - Restore the World (Finished)
    The RPG Consoler 2017-01-03T02:18:00+00:00
  943. CHIP-8 in Common Lisp: Debugging Infrastructure
    Steve Losh 2017-01-05T16:40:00+00:00
  944. Rest Well 1992; Welcome 1993
    The RPG Consoler 2017-01-07T21:19:00+00:00
  945. What I did
    Citybound Devblog 2017-01-09T20:19:37+00:00
  946. Game #66: Ultima: Warriors of Destiny (NES) - Promised Destiny
    The RPG Consoler 2017-01-10T04:23:00+00:00
  947. CHIP-8 in Common Lisp: Menus
    Steve Losh 2017-01-10T16:20:00+00:00
  948. What I did
    Citybound Devblog 2017-01-10T16:21:19+00:00
  949. Building a Teensy 3.2 w/SD and 8 position DIP switch + Reset button
    The Grymoire 2017-01-11T20:02:03+00:00
  950. What I did
    Citybound Devblog 2017-01-11T22:17:38+00:00
  951. Disassembling Jak & Daxter
    codersnotes.com 2017-01-12T08:00:00+00:00
  952. What I did
    Citybound Devblog 2017-01-12T21:37:12+00:00
  953. What happened
    Citybound Devblog 2017-01-13T22:47:05+00:00
  954. What I did
    Citybound Devblog 2017-01-16T22:21:15+00:00
  955. Game #66: Ultima: Warriors of Destiny (NES) - Warriors Rushed (Finished)
    The RPG Consoler 2017-01-17T04:51:00+00:00
  956. What I did
    Citybound Devblog 2017-01-18T22:36:48+00:00
  957. What I did
    Citybound Devblog 2017-01-19T17:41:32+00:00
  958. What I did
    Citybound Devblog 2017-01-20T16:33:51+00:00
  959. What I did
    Citybound Devblog 2017-01-23T22:03:48+00:00
  960. Game #67: Gauntlet IV (Genesis) - Dragons All The Way Up (Finished)
    The RPG Consoler 2017-01-24T04:15:00+00:00
  961. What I did
    Citybound Devblog 2017-01-25T22:09:04+00:00
  962. What I did
    Citybound Devblog 2017-01-27T22:11:45+00:00
  963. Worked mostly on tools in the past weeks. They are now at an...
    DeathTrash 2017-01-28T23:33:35+00:00
  964. What I did
    Citybound Devblog 2017-01-30T21:18:39+00:00
  965. Below the Cut: LandStalker (Genesis)
    The RPG Consoler 2017-01-31T16:21:00+00:00
  966. What I did
    Citybound Devblog 2017-01-31T22:16:03+00:00
  967. Release: Citybound 0.1.1 & 0.1.2
    Citybound Devblog 2017-02-02T00:08:26+00:00
  968. News items from the new year
    Evennia Devblog RSS Feed 2017-02-05T00:00:00+00:00
  969. News items from the new year
    Griatch's Evennia musings 2017-02-05T12:21:00+00:00
  970. January '17 Review & February Plans
    Citybound Devblog 2017-02-07T19:55:43+00:00
  971. How web bloat impacts users with slow connections
    Dan Luu 2017-02-08T00:00:00+00:00
  972. Bash hostname completion
    Arabesque 2017-02-10T10:32:17+00:00
  973. “Nope.”
    DeathTrash 2017-02-10T12:42:44+00:00
  974. What I thought about
    Citybound Devblog 2017-02-14T19:04:18+00:00
  975. Building NES homebrew with makechr.exe
    dustmop.io blog 2017-02-16T18:53:56+00:00
  976. What I thought about
    Citybound Devblog 2017-02-17T18:49:08+00:00
  977. Shell from vi
    Arabesque 2017-02-18T10:46:56+00:00
  978. Mother is waiting.
    DeathTrash 2017-02-19T07:32:05+00:00
  979. What I did
    Citybound Devblog 2017-02-22T22:08:34+00:00
  980. Dragon taming with Tailbiter, a bytecode compiler for Python
    Code Words 2017-02-23T12:00:00+00:00
  981. Game #68: Inindo: Way of the Ninja (SNES) - The Wrong Way
    The RPG Consoler 2017-02-23T16:52:00+00:00
  982. Turns out it's difficult
    Citybound Devblog 2017-02-27T02:08:39+00:00
  983. Eureka?
    Citybound Devblog 2017-02-28T23:25:43+00:00
  984. Game #68: Inindo: Way of the Ninja (SNES) - Gearing Up
    The RPG Consoler 2017-03-02T05:00:00+00:00
  985. Down the Rabbit Hole
    Citybound Devblog 2017-03-07T22:56:20+00:00
  986. Beautiful new words to describe obscure emotions
    The Dictionary of Obscure Sorrows 2017-03-10T16:29:27+00:00
  987. Let the Battle Begin
    Citybound Devblog 2017-03-14T11:05:29+00:00
  988. A Glimpse of Game and Interaction Design in Citybound
    Citybound Devblog 2017-03-14T22:28:20+00:00
  989. What I did the last days
    Citybound Devblog 2017-03-19T23:42:50+00:00
  990. LetsEncrypt + Amazon EC2 = SSLLabs A Rating
    The Grymoire 2017-03-24T15:20:28+00:00
  991. Moving countries again...
    Citybound Devblog 2017-03-27T22:23:36+00:00
  992. With every click of the shutter, you’re trying to press...
    The Dictionary of Obscure Sorrows 2017-03-28T14:05:11+00:00
  993. Water shader.
    DeathTrash 2017-04-01T15:45:50+00:00
  994. What I did the last days
    Citybound Devblog 2017-04-04T07:52:39+00:00
  995. What I did today
    Citybound Devblog 2017-04-07T00:09:35+00:00
  996. Game #68: Inindo: Way of the Ninja (SNES) - Wandering Away (Finished)
    The RPG Consoler 2017-04-10T17:50:00+00:00
  997. Below the Cut: Technoclash (Genesis)
    The RPG Consoler 2017-04-16T21:04:00+00:00
  998. The origins of XXX as FIXME
    Juho Snellman's Weblog 2017-04-17T18:00:00+00:00
  999. CTIA v. Berkeley: affirmed
    LESSIG Blog 2017-04-21T16:44:08+00:00
  1000. The luxury of a creative community
    Evennia Devblog RSS Feed 2017-04-23T00:00:00+00:00
  1001. The luxury of a creative community
    Griatch's Evennia musings 2017-04-23T20:46:00+00:00
  1002. Towards Simplicity & Actual Realism
    Citybound Devblog 2017-04-25T00:00:00+00:00
  1003. Filthy Kitchen
    dustmop.io blog 2017-04-27T17:51:41+00:00
  1004. Introducing Stagemaster
    Citybound Devblog 2017-05-04T00:00:00+00:00
  1005. Game #69: Super Ninja Boy (SNES) - Dragon Ball Gaiden (Finished)
    The RPG Consoler 2017-05-07T18:35:00+00:00
  1006. The Pain Of Linear Types In Rust - Faultlore
    Faultlore 2017-05-08T00:00:00+00:00
  1007. A Day in a Table
    Citybound Devblog 2017-05-10T00:00:00+00:00
  1008. Two Tiny Triumphs
    Citybound Devblog 2017-05-13T00:00:00+00:00
  1009. Economy Sculpting
    Citybound Devblog 2017-05-22T00:00:00+00:00
  1010. Let's do more together
    Citybound Devblog 2017-06-04T00:00:00+00:00
  1011. Startup options v. cash
    Dan Luu 2017-06-07T00:00:00+00:00
  1012. Stylized VFX in RIME
    Simonschreibt. 2017-06-07T12:56:28+00:00
  1013. The widely cited studies on mouse vs. keyboard efficiency are completely bogus
    Dan Luu 2017-06-13T00:00:00+00:00
  1014. Bugs You'll Probably Only Have In Rust - Faultlore
    Faultlore 2017-06-14T00:00:00+00:00
  1015. The First "Patrons Calling" & How it Went
    Citybound Devblog 2017-06-15T00:00:00+00:00
  1016. Interviewing My Top Patron (and a small status update)
    Citybound Devblog 2017-06-20T00:00:00+00:00
  1017. Game #70: Great Greed (Game Boy) - The Road to Hell (Finished)
    The RPG Consoler 2017-07-06T06:08:00+00:00
  1018. 868-HACK: PLAN.B
    Mighty Vision 2017-07-11T09:48:00+00:00
  1019. 868-HACK: PLAN.B
    Mighty Vision 2017-07-11T09:48:00+00:00
  1020. Opening Up Patrons Calling, Join the 3rd!
    Citybound Devblog 2017-07-14T00:00:00+00:00
  1021. Terminal latency
    Dan Luu 2017-07-18T00:00:00+00:00
  1022. Game #71: Might and Magic III: Isles of Terra (SNES) - A Glitch in Timing
    The RPG Consoler 2017-07-18T04:19:00+00:00
  1023. I don't want no 'wantarray'
    Juho Snellman's Weblog 2017-07-18T18:00:00+00:00
  1024. The mystery of the hanging S3 downloads
    Juho Snellman's Weblog 2017-07-20T16:00:00+00:00
  1025. plan.b notes
    Mighty Vision 2017-07-24T13:22:00+00:00
  1026. plan.b notes
    Mighty Vision 2017-07-24T13:22:00+00:00
  1027. Game #71: Might and Magic III: Isles of Terra (SNES) - A Port for Whining
    The RPG Consoler 2017-07-26T01:43:00+00:00
  1028. Welcome to pizzabox.computer
    Pizza Box Computer 2017-07-31T01:01:38+00:00
  1029. Game #71: Might and Magic III: Isles of Terra (SNES) - A Note with Rhyming
    The RPG Consoler 2017-08-02T03:07:00+00:00
  1030. Sneak Peak: The Open Design Doc
    Citybound Devblog 2017-08-04T00:00:00+00:00
  1031. Enhanced vision, not coordination
    Romain Laurent 2017-08-05T18:09:21+00:00
  1032. Game Engine Black Book ReleaseDate
    Fabien Sanglard 2017-08-07T01:08:45+00:00
  1033. Game #71: Might and Magic III: Isles of Terra (SNES) - A Key Divining
    The RPG Consoler 2017-08-08T02:00:00+00:00
  1034. Sattolo's algorithm
    Dan Luu 2017-08-09T00:00:00+00:00
  1035. Game #71: Might and Magic III: Isles of Terra (SNES) - A Game Unwinding (Finished)
    The RPG Consoler 2017-08-14T15:47:00+00:00
  1036. What Is a Workstation
    Pizza Box Computer 2017-08-18T01:30:00+00:00
  1037. Why PS4 downloads are so slow
    Juho Snellman's Weblog 2017-08-19T19:00:00+00:00
  1038. 3 Days of Citybound
    Citybound Devblog 2017-08-20T00:00:00+00:00
  1039. The Dream
    Pizza Box Computer 2017-08-20T18:52:04+00:00
  1040. Branch prediction
    Dan Luu 2017-08-23T00:00:00+00:00
  1041. Digital VAXstation 4000 VLC
    Pizza Box Computer 2017-08-24T00:30:00+00:00
  1042. NeXTstation mono
    Pizza Box Computer 2017-08-24T00:30:00+00:00
  1043. Silicon Graphics Indy
    Pizza Box Computer 2017-08-24T00:30:00+00:00
  1044. Sun SPARCstation 1+
    Pizza Box Computer 2017-08-24T00:30:00+00:00
  1045. The First Four Pizzaboxes
    Pizza Box Computer 2017-08-24T00:30:00+00:00
  1046. Renaming Django's Auth User and App
    Evennia Devblog RSS Feed 2017-08-25T00:00:00+00:00
  1047. Renaming Django's Auth User and App
    Griatch's Evennia musings 2017-08-25T21:22:00+00:00
  1048. FizzleFade
    Fabien Sanglard 2017-08-28T01:08:45+00:00
  1049. Current state of the game.
    DeathTrash 2017-08-30T09:28:19+00:00
  1050. Indy Power Supply Replacement
    Pizza Box Computer 2017-09-02T20:20:00+00:00
  1051. The Danger Of Opinions
    codersnotes.com 2017-09-03T07:00:00+00:00
  1052. Numbers and tagged pointers in early Lisp implementations
    Juho Snellman's Weblog 2017-09-04T15:00:00+00:00
  1053. Cubic Liters
    Romain Laurent 2017-09-05T22:34:47+00:00
  1054. Game Engine Black Book Postmortem
    Fabien Sanglard 2017-09-07T01:08:45+00:00
  1055. Why Command And Vector Processors Rock
    codersnotes.com 2017-09-07T07:00:00+00:00
  1056. Booting the Indy
    Pizza Box Computer 2017-09-10T02:20:00+00:00
  1057. I DEMAND Bassel Khartabil’s DATE OF DEATH and REMAINS...
    LESSIG Blog 2017-09-15T20:30:00+00:00
  1058. HP 9000 Model 712/60
    Pizza Box Computer 2017-09-19T01:00:00+00:00
  1059. Macintosh Quadra 605
    Pizza Box Computer 2017-09-19T01:00:00+00:00
  1060. Macintosh Quadra 610
    Pizza Box Computer 2017-09-19T01:00:00+00:00
  1061. Power Macintosh 6100/60
    Pizza Box Computer 2017-09-19T01:00:00+00:00
  1062. Two Macs and an HP
    Pizza Box Computer 2017-09-19T01:00:00+00:00
  1063. Evennia 0.7 released
    Evennia Devblog RSS Feed 2017-09-20T00:00:00+00:00
  1064. Evennia 0.7 released
    Griatch's Evennia musings 2017-09-20T20:44:00+00:00
  1065. streak scoring redux
    Mighty Vision 2017-09-21T13:36:00+00:00
  1066. streak scoring redux
    Mighty Vision 2017-09-21T13:36:00+00:00
  1067. Surprisingly Networking
    Citybound Devblog 2017-09-25T00:00:00+00:00
  1068. Getting an Indy Desktop
    Pizza Box Computer 2017-09-26T01:00:00+00:00
  1069. Cool Stuff with Textures
    Simonschreibt. 2017-09-27T08:00:57+00:00
  1070. SNESPi - 3D Printed Raspberry Pi Mini SNES(s)
    daftmike's blog 2017-09-28T02:51:00+00:00
  1071. Evennia in Hacktoberfest 2017
    Evennia Devblog RSS Feed 2017-10-01T00:00:00+00:00
  1072. Evennia in Hacktoberfest 2017
    Griatch's Evennia musings 2017-10-01T20:05:00+00:00
  1073. RustFest Recap & Patron's Calling this Friday
    Citybound Devblog 2017-10-02T00:00:00+00:00
  1074. Game #72: Ninja Boy 2 (Game Boy) - Ninjas vs. Pirates, In Space! (Finished)
    The RPG Consoler 2017-10-06T06:20:00+00:00
  1075. Between a rock and a hard place
    Romain Laurent 2017-10-07T20:14:35+00:00
  1076. The 5th Patrons Calling, Voting for the 6th & Thoughts
    Citybound Devblog 2017-10-09T00:00:00+00:00
  1077. This is why I became an Engineer
    Terrible Banana 2017-10-09T00:44:13+00:00
  1078. Little Lightmap Tricks
    codersnotes.com 2017-10-10T07:00:00+00:00
  1079. A tool to get your copyrights back
    LESSIG Blog 2017-10-12T10:15:39+00:00
  1080. Below the Cut: Dungeon Explorer II (TurboGrafx-CD)
    The RPG Consoler 2017-10-15T18:14:00+00:00
  1081. Keyboard latency
    Dan Luu 2017-10-16T00:00:00+00:00
  1082. 6th Patrons Calling will be on Sun 22nd 5PM CEST
    Citybound Devblog 2017-10-18T00:00:00+00:00
  1083. My RustFest Talk (With Networking Demo)
    Citybound Devblog 2017-10-19T00:00:00+00:00
  1084. Filesystem error handling
    Dan Luu 2017-10-23T00:00:00+00:00
  1085. Something Rotten In The Core
    codersnotes.com 2017-10-24T07:00:00+00:00
  1086. Getting a MUD RP scene going
    Evennia Devblog RSS Feed 2017-10-29T00:00:00+00:00
  1087. Replacing a dead NVRAM chip
    Pizza Box Computer 2017-10-29T00:00:00+00:00
  1088. Getting a MUD Roleplaying Scene going
    Griatch's Evennia musings 2017-10-29T16:01:00+00:00
  1089. October 2017 Prototype Release
    Citybound Devblog 2017-11-03T00:00:00+00:00
  1090. HOME :: Pins & Patches :: LAPEL PINS :: Dick Banana
    Terrible Banana 2017-11-04T00:56:15+00:00
  1091. Just fuck right off
    Terrible Banana 2017-11-06T16:32:15+00:00
  1092. UI backwards compatibility
    Dan Luu 2017-11-09T00:00:00+00:00
  1093. How out of date are Android devices?
    Dan Luu 2017-11-12T00:00:00+00:00
  1094. How good are decisions? Evaluating decision quality in domains where evaluation is easy
    Dan Luu 2017-11-21T00:00:00+00:00
  1095. Booting the SPARCstation
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1096. Data General AViiON AV/300D
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1097. DEC 3000 300X
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1098. Digital AlphaStation 200 4/233
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1099. Digital Multia
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1100. IBM RS/6000 POWERstation Model 250
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1101. Three new RISC boxes
    Pizza Box Computer 2017-11-21T21:00:00+00:00
  1102. November 2017 Prototype Release
    Citybound Devblog 2017-12-06T00:00:00+00:00
  1103. Newsletter #8 - Open up the Windows
    Neovim 2017-12-16T00:00:00+00:00
  1104. Computer latency: 1977-2017
    Dan Luu 2017-12-24T00:00:00+00:00
  1105. untitled
    Mighty Vision 2017-12-24T18:17:00+00:00
  1106. untitled
    Mighty Vision 2017-12-24T18:17:00+00:00
  1107. Cinco Paus - dev notes
    Mighty Vision 2017-12-27T15:13:00+00:00
  1108. Cinco Paus - dev notes
    Mighty Vision 2017-12-27T15:13:00+00:00
  1109. New year, new stuff
    Evennia Devblog RSS Feed 2018-01-05T00:00:00+00:00
  1110. New year, new stuff
    Griatch's Evennia musings 2018-01-05T10:29:00+00:00
  1111. In Search Of The Lost Program
    codersnotes.com 2018-01-11T08:00:00+00:00
  1112. Self awareness
    Romain Laurent 2018-01-15T12:40:13+00:00
  1113. untitled
    Mighty Vision 2018-01-17T18:19:00+00:00
  1114. untitled
    Mighty Vision 2018-01-17T18:19:00+00:00
  1115. Metasploit+Amazon SES, or debugging Sendmail’s SMTP Authentication
    The Grymoire 2018-01-17T19:29:32+00:00
  1116. Bus Pirate Cables – which is the best?
    The Grymoire 2018-01-18T14:38:44+00:00
  1117. Game #73: Dungeon Master (SNES) - Dungeon Meat of Doom!
    The RPG Consoler 2018-01-22T03:10:00+00:00
  1118. Kicking into gear from a distance
    Evennia Devblog RSS Feed 2018-01-27T00:00:00+00:00
  1119. Kicking into gear from a distance
    Griatch's Evennia musings 2018-01-27T22:27:00+00:00
  1120. How I spent December and January
    Citybound Devblog 2018-02-04T00:00:00+00:00
  1121. 7th Patrons Calling will be on Sun 11th 5PM CET
    Citybound Devblog 2018-02-09T00:00:00+00:00
  1122. A PowerMac Surprise!
    Pizza Box Computer 2018-02-14T02:00:00+00:00
  1123. Unifying Road & Zoning UI
    Citybound Devblog 2018-03-11T00:00:00+00:00
  1124. scabulous
    The Dictionary of Obscure Sorrows 2018-03-12T13:06:00+00:00
  1125. Altschmerz
    The Dictionary of Obscure Sorrows 2018-03-12T13:19:30+00:00
  1126. onism
    The Dictionary of Obscure Sorrows 2018-03-12T13:21:56+00:00
  1127. March '18 Hackathon
    Citybound Devblog 2018-03-17T00:00:00+00:00
  1128. Booting the Multia
    Pizza Box Computer 2018-03-17T21:00:00+00:00
  1129. Conclusions about the Hackathon & Livestreaming
    Citybound Devblog 2018-03-27T00:00:00+00:00
  1130. Fsyncgate: errors on fsync are unrecovarable
    Dan Luu 2018-03-28T00:00:00+00:00
  1131. Steel Survivor: an IBM XT Tale
    int10h.org - VileR's blog 2018-03-29T19:58:37+00:00
  1132. LED Matrix Animation Frame
    daftmike's blog 2018-03-31T15:53:00+00:00
  1133. Phlogiston preview
    Mighty Vision 2018-04-04T12:21:00+00:00
  1134. Phlogiston preview
    Mighty Vision 2018-04-04T12:21:00+00:00
  1135. 10 Minute Mod: GameBoy Screen Rot Fix(?)
    daftmike's blog 2018-04-08T22:17:00+00:00
  1136. A Dusting of Gamification
    Joel on Software 2018-04-13T13:40:21+00:00
  1137. What the Zoning Prototype Will Bring
    Citybound Devblog 2018-04-22T00:00:00+00:00
  1138. Strange and maddening rules
    Joel on Software 2018-04-23T14:42:45+00:00
  1139. Small update
    DeathTrash 2018-04-23T17:23:08+00:00
  1140. The Giffinator – Technical Details
    dustmop.io blog 2018-04-26T18:59:53+00:00
  1141. Booting the HP 712
    Pizza Box Computer 2018-04-28T22:00:00+00:00
  1142. Stylized VFX in RIME – Water Edition
    Simonschreibt. 2018-05-01T11:18:30+00:00
  1143. Announcing Stack Overflow for Teams
    Joel on Software 2018-05-03T12:58:25+00:00
  1144. Cleaning the VAXstation
    Pizza Box Computer 2018-05-04T02:40:00+00:00
  1145. So Long, Blogspot
    int10h.org - VileR's blog 2018-05-05T13:45:28+00:00
  1146. Death Trash
    DeathTrash 2018-05-05T15:02:54+00:00
  1147. Zelda – The Bling-Bling Offset
    Simonschreibt. 2018-05-05T20:19:35+00:00
  1148. pâron. the feeling that no matter what you do is always somehow...
    The Dictionary of Obscure Sorrows 2018-05-15T12:22:03+00:00
  1149. ⋇⋇
    Romain Laurent 2018-05-17T01:36:01+00:00
  1150. Fun with Macros: Gathering
    Steve Losh 2018-05-21T16:05:00+00:00
  1151. midding
    The Dictionary of Obscure Sorrows 2018-05-22T08:48:29+00:00
  1152. Flexi IBM VGA Font: a Scalable Take on Text Mode
    int10h.org - VileR's blog 2018-05-22T21:01:21+00:00
  1153. imbroglio notes 13 - phlogiston
    Mighty Vision 2018-05-24T15:55:00+00:00
  1154. imbroglio notes 13 - phlogiston
    Mighty Vision 2018-05-24T15:55:00+00:00
  1155. Taking Decent Photos of your CRT TV Screen
    int10h.org - VileR's blog 2018-06-16T20:37:28+00:00
  1156. Game #73: Dungeon Master (SNES) - All Things in Moderation, Including Moderation (Finished)
    The RPG Consoler 2018-06-19T02:49:00+00:00
  1157. 3D-Printed (Baby) Drum Pedal
    daftmike's blog 2018-07-02T19:02:00+00:00
  1158. A yak shave with SGI's EFS
    Pizza Box Computer 2018-07-04T19:00:00+00:00
  1159. Digital DECstation 5000/200
    Pizza Box Computer 2018-07-07T18:00:00+00:00
  1160. Filling in more cracks
    Pizza Box Computer 2018-07-07T18:00:00+00:00
  1161. HP 9000 Model 425e
    Pizza Box Computer 2018-07-07T18:00:00+00:00
  1162. Sun SPARCstation 20
    Pizza Box Computer 2018-07-07T18:00:00+00:00
  1163. Sun SunBlade 150
    Pizza Box Computer 2018-07-07T18:00:00+00:00
  1164. Fun with Macros: If-Let and When-Let
    Steve Losh 2018-07-09T16:00:00+00:00
  1165. Game #74: Sorcerer's Kingdom (Genesis) - Understandably Forgotten (Finished)
    The RPG Consoler 2018-07-11T03:26:00+00:00
  1166. Photo
    Terrible Banana 2018-07-12T04:07:50+00:00
  1167. Hey “anonymous” (where is trademark law when you need it?), f*ck you.
    LESSIG Blog 2018-07-12T12:27:47+00:00
  1168. Optimizing a breadth-first search
    Juho Snellman's Weblog 2018-07-23T16:00:00+00:00
  1169. (⊙_◎)
    Romain Laurent 2018-07-30T06:41:49+00:00
  1170. Portable Lumipen: Latest project from Ishikawa Senoo Laboratory...
    prosthetic knowledge 2018-07-31T22:09:54+00:00
  1171. APPARATUM: Musical interactive interface from pangenerator is...
    prosthetic knowledge 2018-08-01T22:41:40+00:00
  1172. BRUTE: German wine maker whose branding (put together by Patrik...
    prosthetic knowledge 2018-08-03T16:14:41+00:00
  1173. The Reeplicator AI: Project by Rama Allen of The Mill creates a...
    prosthetic knowledge 2018-08-05T20:29:37+00:00
  1174. Ultra Mary: Project by Anastasia Alekhina is a collection of LED...
    prosthetic knowledge 2018-08-05T21:16:01+00:00
  1175. Sunday evening mood
    Romain Laurent 2018-08-06T01:30:34+00:00
  1176. Neural Beatbox: Coding project from Nao Tokui uses neural...
    prosthetic knowledge 2018-08-07T15:41:03+00:00
  1177. The Alternative Late Show with Stephen Colbert: Short video...
    prosthetic knowledge 2018-08-07T21:13:59+00:00
  1178. TVCGAFIX Utilities - Adjust CGA Output for TV
    int10h.org - VileR's blog 2018-08-07T22:51:28+00:00
  1179. After the storm
    Romain Laurent 2018-08-08T18:47:24+00:00
  1180. The Barcoders: Musical project featuring an ensemble including Ei...
    prosthetic knowledge 2018-08-08T22:40:58+00:00
  1181. Fast Pix2Pix: Project from Zaid Alyafeai presents a faster...
    prosthetic knowledge 2018-08-13T16:03:32+00:00
  1182. Text to Image: Latest web-based project from Cristóbal...
    prosthetic knowledge 2018-08-15T21:32:03+00:00
  1183. prostheticknowledge: Fast Pix2Pix Project from Zaid Alyafeai...
    prosthetic knowledge 2018-08-17T16:45:05+00:00
  1184. Inline building in upcoming Evennia 0.8
    Evennia Devblog RSS Feed 2018-08-18T00:00:00+00:00
  1185. Inline building in upcoming Evennia 0.8
    Griatch's Evennia musings 2018-08-18T12:23:00+00:00
  1186. Dynamic density shaping of photokinetic E. coli: Research from...
    prosthetic knowledge 2018-08-18T13:53:43+00:00
  1187. Recycle-GAN: Graphics research from Carnegie Mellon’s School of...
    prosthetic knowledge 2018-08-18T20:52:30+00:00
  1188. Inventory: Project from Oddviz is a collection of images composed...
    prosthetic knowledge 2018-08-20T14:06:24+00:00
  1189. Video-to-Video Synthesis: Amazing graphics research from @nvidia...
    prosthetic knowledge 2018-08-20T15:41:27+00:00
  1190. don’t do it man: call the national suicide prevention hotline at...
    Terrible Banana 2018-08-22T04:40:11+00:00
  1191. Everybody Dance Now: Graphics research from UC Berkeley is the...
    prosthetic knowledge 2018-08-24T12:45:50+00:00
  1192. A Road to Common Lisp
    Steve Losh 2018-08-27T15:50:00+00:00
  1193. What The Hell Was The Microsoft Network?
    codersnotes.com 2018-08-29T07:00:00+00:00
  1194. 1: Michael Drogalis on Pyrostore's Acquisition, the future of Onyx, and stream processing
    The REPL 2018-08-29T08:38:56+00:00
  1195. AIRBNB HOSTS: Online game by Dries Depoorter and David...
    prosthetic knowledge 2018-08-29T17:21:20+00:00
  1196. Fast Pix2Pix - Update: Interesting additions to project by Zaid...
    prosthetic knowledge 2018-08-30T14:20:05+00:00
  1197. Photo
    .mattfraction 2018-09-01T00:00:27+00:00
  1198. photos-of-space: Active Prominences on a Quiet Sun (Photo: Alan...
    .mattfraction 2018-09-02T00:00:36+00:00
  1199. Time to call it a day ...
    prosthetic knowledge 2018-09-05T21:43:49+00:00
  1200. The Architecture of the Medieval Page
    medievalbooks 2018-09-07T16:36:36+00:00
  1201. 2: Daniel Higginbotham on Specmonstah, Clojure Spec, and Ent walking trees
    The REPL 2018-09-10T09:45:06+00:00
  1202. casualmenofaction: mattfractionblog: In the last two shoots...
    .mattfraction 2018-09-10T16:31:39+00:00
  1203. kierongillen: die-comic: For more information go read the first...
    .mattfraction 2018-09-10T23:33:53+00:00
  1204. Apple's topsy-turvy iPhone lineup
    Article on Coyote Cartography 2018-09-13T18:20:57+00:00
  1205. 3: Mike Fikes on ClojureScript type inference, Graal, and Clojurists Together
    The REPL 2018-09-16T12:29:45+00:00
  1206. speakingparts: Les garçons sauvages, Bertrand Mandico 2017
    .mattfraction 2018-09-17T00:00:32+00:00
  1207. untitled
    .mattfraction 2018-09-17T01:49:06+00:00
  1208. Me, Myself, and I: The Story of Two Medieval Selfies
    medievalbooks 2018-09-20T17:52:09+00:00
  1209. Introducing Live Builds
    Citybound Devblog 2018-09-21T00:00:00+00:00
  1210. abandonedandurbex: Stairwell in an abandoned button factory
    .mattfraction 2018-09-23T00:00:39+00:00
  1211. Bloated
    Fabien Sanglard 2018-09-23T01:08:45+00:00
  1212. oldshowbiz: Don’t let anybody tell you that there were no women...
    .mattfraction 2018-09-24T00:00:23+00:00
  1213. nevver: David Shrigley
    .mattfraction 2018-09-25T00:00:26+00:00
  1214. 4: Bruce Hauman on interactive development, Figwheel, and Rebel Readline
    The REPL 2018-09-25T20:39:22+00:00
  1215. abandonedandurbex: 132 year old rifle that was found leaning up...
    .mattfraction 2018-09-26T00:00:36+00:00
  1216. Photo
    .mattfraction 2018-09-27T00:00:37+00:00
  1217. bushdog: (from 1958 Danelectro U3 - Thunder Road Guitars Seattle)
    .mattfraction 2018-09-28T00:00:25+00:00
  1218. seanhowe: Advertisement for the Psychedelicatessen, 164 Avenue...
    .mattfraction 2018-09-29T00:00:43+00:00
  1219. HELLO COMBLIUMIBUS HELLO CXC @skellyskellyskelly x me DROP HOT...
    .mattfraction 2018-09-29T14:21:01+00:00
  1220. Evennia 0.8 released
    Evennia Devblog RSS Feed 2018-09-30T00:00:00+00:00
  1221. biomorphosis: When you flip bats upside down they become...
    .mattfraction 2018-09-30T00:00:35+00:00
  1222. sorry errbody @ cxc. i spent the previous evening vomiting more...
    .mattfraction 2018-09-30T11:49:30+00:00
  1223. Evennia 0.8 released
    Griatch's Evennia musings 2018-09-30T19:35:00+00:00
  1224. oldshowbiz: The Burt Reynolds Late Show
    .mattfraction 2018-10-01T00:00:15+00:00
  1225. 5: Looking At The Web After Tomorrow with Nikita Prokopov
    The REPL 2018-10-03T22:15:03+00:00
  1226. Evennia in Hacktoberfest 2018
    Griatch's Evennia musings 2018-10-04T08:34:00+00:00
  1227. Doodles in Medieval Manuscripts
    medievalbooks 2018-10-05T18:46:26+00:00
  1228. Notes on Type Layouts and ABIs in Rust - Faultlore
    Faultlore 2018-10-09T00:00:00+00:00
  1229. 6: Thomas Heller on Shadow CLJS
    The REPL 2018-10-10T03:49:03+00:00
  1230. wildragon: Probably Flinthook’s Bounty Battle storyline in a...
    Tribute Games 2018-10-17T20:00:17+00:00
  1231. 7: Ben Brinckerhoff on Clojure Spec and Error Messages
    The REPL 2018-10-18T01:09:54+00:00
  1232. What does Stack Overflow want to be when it grows up?
    Coding Horror 2018-10-22T10:52:32+00:00
  1233. 8: Elana Hashman on Debian and Clojure
    The REPL 2018-10-24T21:21:37+00:00
  1234. 9: Hannah Henderson on Continuous Integration at CircleCI
    The REPL 2018-11-02T03:31:00+00:00
  1235. Medieval Book Carousels
    medievalbooks 2018-11-02T16:54:17+00:00
  1236. 10: Howard Lewis Ship on GraphQL and Lacinia
    The REPL 2018-11-12T09:32:59+00:00
  1237. The Kinds of Implementation-Defined? - Faultlore
    Faultlore 2018-11-13T00:00:00+00:00
  1238. glamoramamama75:
    .mattfraction 2018-11-13T09:33:46+00:00
  1239. Low cognitive load blogging
    Article on Coyote Cartography 2018-11-15T00:40:53+00:00
  1240. The new iPad Pro
    Article on Coyote Cartography 2018-11-16T19:28:03+00:00
  1241. 11: Saskia Lindner on re-frame-10x, compassionate coding, and mindfulness
    The REPL 2018-11-25T19:00:00+00:00
  1242. 12: Clojure documentation with Martin Klepsch
    The REPL 2018-11-27T19:00:00+00:00
  1243. The Cluster Clan is a colorful bunch of swashbucklers! See...
    Tribute Games 2018-11-28T19:06:49+00:00
  1244. Photo
    Terrible Banana 2018-11-29T21:23:12+00:00
  1245. The 12 Days of Tribute Giveaway – How To Win: Starting December...
    Tribute Games 2018-12-01T17:00:38+00:00
  1246. Installing A/UX on the Quadra 610
    Pizza Box Computer 2018-12-02T21:45:00+00:00
  1247. 13: High performance Clojure numerics with Chris Nuernberger
    The REPL 2018-12-04T19:00:00+00:00
  1248. Let's talk about the Tumblrpocalypse
    Article on Coyote Cartography 2018-12-04T20:57:52+00:00
  1249. 14: ClojureScript, Lumo, and Lambdas with Antonio Monteiro
    The REPL 2018-12-05T02:23:28+00:00
  1250. Game Engine Black Book: Wolfenstein 3D, 2nd Edition
    Fabien Sanglard 2018-12-06T00:00:00+00:00
  1251. tributegames: The 12 Days of Tribute Giveaway – How To...
    Tribute Games 2018-12-06T16:13:02+00:00
  1252. Vimways: From .vimrc to .vim
    Arabesque 2018-12-08T08:50:17+00:00
  1253. Game Engine Black Book: DOOM
    Fabien Sanglard 2018-12-10T00:00:00+00:00
  1254. FUCK YOU TUMBLR NO PANTS ON NO FUCKS GIVEN YOU DONT BAN ME I BAN YOU EAT MY WHOLE COOKIE ASS…
    .mattfraction 2018-12-10T12:15:26+00:00
  1255. Vimways: Runtime hackery
    Arabesque 2018-12-10T21:27:24+00:00
  1256. How the Dreamcast copy protection was defeated
    Fabien Sanglard 2018-12-11T00:00:00+00:00
  1257. 🔴 NOW LIVE! ➡️ www.twitch.tv/tributegames 12 Days of Tribute...
    Tribute Games 2018-12-13T21:00:26+00:00
  1258. Ninja Senki Now Available!
    Tribute Games 2018-12-20T15:00:24+00:00
  1259. Get up to 80% OFF Flinthook, Curses ‘N Chaos, Mercenary Kings:...
    Tribute Games 2018-12-20T20:00:29+00:00
  1260. Deciphering the postcard sized raytracer
    Fabien Sanglard 2018-12-24T00:00:00+00:00
  1261. How DOOM fire was made
    Fabien Sanglard 2018-12-28T00:00:00+00:00
  1262. Into 2019
    Evennia Devblog RSS Feed 2019-01-02T00:00:00+00:00
  1263. Into 2019!
    Griatch's Evennia musings 2019-01-02T19:22:00+00:00
  1264. River 2k19 Edition
    Simonschreibt. 2019-01-07T08:51:02+00:00
  1265. Toggle Redshift with Keyboard Shortcut
    Winny's Blog 2019-01-09T13:04:00+00:00
  1266. 15: Clojure at Apple with David Taylor
    The REPL 2019-01-11T19:14:56+00:00
  1267. Publishing with org-static-blog
    Winny's Blog 2019-01-11T21:00:00+00:00
  1268. GNU C Style
    Winny's Blog 2019-01-13T06:00:00+00:00
  1269. Compartmentalization
    Romain Laurent 2019-01-14T19:13:24+00:00
  1270. quiet blog year
    Mighty Vision 2019-01-17T00:20:00+00:00
  1271. quiet blog year
    Mighty Vision 2019-01-17T00:20:00+00:00
  1272. 16: Monorepos and monologues with Alex Engelberg
    The REPL 2019-01-18T18:46:47+00:00
  1273. A bit of a stretch
    Romain Laurent 2019-01-18T23:26:23+00:00
  1274. Blink Shell: First Thoughts
    Winny's Blog 2019-01-23T03:35:00+00:00
  1275. The Oldest Surviving Printed Advertisement in English (London, 1477)
    medievalbooks 2019-01-24T18:27:27+00:00
  1276. Emotional Flow
    Romain Laurent 2019-01-24T20:05:51+00:00
  1277. Fighting For A Miracle: Venezuela’s War Between Past, Present, and Future
    Not My Empire 2019-01-27T03:07:57+00:00
  1278. 17: Editing Clojure code with Shaun Lebron
    The REPL 2019-02-04T06:30:00+00:00
  1279. insidematthieu: A fanimation of Leo and Lea from Curses and...
    Tribute Games 2019-02-05T15:53:19+00:00
  1280. A digression about Facebook
    Article on Coyote Cartography 2019-02-05T16:54:43+00:00
  1281. untitled
    Article on Coyote Cartography 2019-02-05T21:48:10+00:00
  1282. untitled
    Article on Coyote Cartography 2019-02-13T15:47:45+00:00
  1283. 📢 IMPORTANT NEWS 📢 We are excited to announce that we will have...
    Tribute Games 2019-02-13T21:25:30+00:00
  1284. The Cloud Is Just Someone Else’s Computer
    Coding Horror 2019-02-17T02:15:26+00:00
  1285. Randomized trial on gender in Overwatch
    Dan Luu 2019-02-19T00:00:00+00:00
  1286. 18: Testing Clojure and ClojureScript with Arne Brasseur
    The REPL 2019-02-20T03:54:45+00:00
  1287. Ilhan Omar Is Fighting The Real Anti-Semites
    Not My Empire 2019-03-09T15:47:14+00:00
  1288. 19: Formatting Clojure code with Shaun Lebron
    The REPL 2019-03-12T18:00:00+00:00
  1289. New Upcoming Game: Panzer Paladin
    Tribute Games 2019-03-13T15:00:26+00:00
  1290. Panzer Paladin Announced by Tribute Games
    Tribute Games 2019-03-14T16:06:36+00:00
  1291. Google Summer of Code 2019
    Neovim 2019-03-17T00:00:00+00:00
  1292. 20: Clojure MXNet with Carin Meier
    The REPL 2019-03-19T18:00:00+00:00
  1293. Why Hashbrown Does A Double-Lookup - Faultlore
    Faultlore 2019-03-20T00:00:00+00:00
  1294. Irony and the Alt-Right: What the Christchurch Shooter Tells Us About Belief
    Not My Empire 2019-03-21T23:58:31+00:00
  1295. Ten (More) Brief Thoughts On Russiagate
    Not My Empire 2019-03-26T00:46:31+00:00
  1296. 21: Looking at Clojure through the mindset of business with Jonathan Boston
    The REPL 2019-03-26T18:00:00+00:00
  1297. The next CEO of Stack Overflow
    Joel on Software 2019-03-28T14:00:53+00:00
  1298. The story of the Rendition Vérité 1000
    Fabien Sanglard 2019-04-01T00:00:00+00:00
  1299. 22: Cursive IDE with Colin Fleming
    The REPL 2019-04-02T18:19:46+00:00
  1300. The story of the 3dfx Voodoo 1
    Fabien Sanglard 2019-04-04T00:00:00+00:00
  1301. pixelartus: Panzer Paladin, is an upcoming action platformer in...
    Tribute Games 2019-04-04T21:07:37+00:00
  1302. Medium thinks it's a brand
    Article on Coyote Cartography 2019-04-13T16:40:36+00:00
  1303. 23: Elements of Clojure with Zach Tellman
    The REPL 2019-04-18T23:06:18+00:00
  1304. Steaming on Eating Jam
    Evennia Devblog RSS Feed 2019-04-25T00:00:00+00:00
  1305. Steaming on, eating jam
    Griatch's Evennia musings 2019-04-25T09:42:00+00:00
  1306. Podcast about Evennia
    Evennia Devblog RSS Feed 2019-05-09T00:00:00+00:00
  1307. Podcast about Evennia
    Griatch's Evennia musings 2019-05-09T13:38:00+00:00
  1308. Celebrating 8 years of Tribute!
    Tribute Games 2019-05-09T15:00:30+00:00
  1309. Writing a procedural puzzle generator
    Juho Snellman's Weblog 2019-05-14T15:00:00+00:00
  1310. De-uglifying 40-Column Text Games for VGA
    int10h.org - VileR's blog 2019-05-15T22:16:53+00:00
  1311. Game Engine Black Book update
    Fabien Sanglard 2019-05-17T00:00:00+00:00
  1312. Creating Evscaperoom Part 1
    Evennia Devblog RSS Feed 2019-05-18T00:00:00+00:00
  1313. Random (but not angry) thoughts on "Game of Thrones"
    Article on Coyote Cartography 2019-05-18T02:58:26+00:00
  1314. Creating Evscaperoom, part 1
    Griatch's Evennia musings 2019-05-18T18:50:00+00:00
  1315. Here's My Type, So Initialize Me Maybe - Faultlore
    Faultlore 2019-05-21T00:00:00+00:00
  1316. Fontraption (a VGA Text Mode Font Editor)
    int10h.org - VileR's blog 2019-05-22T20:56:31+00:00
  1317. Creating Evscaperoom Part 2
    Evennia Devblog RSS Feed 2019-05-26T00:00:00+00:00
  1318. Creating Evscaperoom, part 2
    Griatch's Evennia musings 2019-05-26T09:03:00+00:00
  1319. An Exercise Program for the Fat Web
    Coding Horror 2019-05-30T11:04:52+00:00
  1320. Citybound as a Truly Moddable and Educational Simulation
    Citybound Devblog 2019-06-02T00:00:00+00:00
  1321. 24: Crux, a new bitemporal database from JUXT
    The REPL 2019-06-12T06:27:11+00:00
  1322. Installing pyftdi on Ubuntu 18.04 for FT232H and FT2232H boards
    The Grymoire 2019-06-12T14:11:45+00:00
  1323. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T06:39:20+00:00
  1324. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T06:48:20+00:00
  1325. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T07:15:32+00:00
  1326. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T08:10:02+00:00
  1327. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T08:18:13+00:00
  1328. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T08:33:10+00:00
  1329. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T08:40:08+00:00
  1330. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T08:40:22+00:00
  1331. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-20T08:40:32+00:00
  1332. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-06-21T05:32:22+00:00
  1333. 25: Dragan Djuric on Neanderthal
    The REPL 2019-06-26T07:00:00+00:00
  1334. Evennia 0.9 released
    Evennia Devblog RSS Feed 2019-07-04T00:00:00+00:00
  1335. Evennia 0.9 released
    Griatch's Evennia musings 2019-07-04T17:45:00+00:00
  1336. 26: Nathan Marz on a new programming paradigm
    The REPL 2019-07-10T07:00:00+00:00
  1337. Files are fraught with peril
    Dan Luu 2019-07-12T00:00:00+00:00
  1338. Review: Brydge 12.9″ Keyboard Pro
    Article on Coyote Cartography 2019-07-12T14:43:43+00:00
  1339. P1 SELECT
    Mighty Vision 2019-07-13T17:22:00+00:00
  1340. P1 SELECT
    Mighty Vision 2019-07-13T17:22:00+00:00
  1341. Photo
    Terrible Banana 2019-07-14T17:11:12+00:00
  1342. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-07-16T09:17:28+00:00
  1343. For every one drawing I scan, there’s at least 10 that I...
    HIGHLIGHTER AND SHARPIE PARTY 2019-07-19T08:20:10+00:00
  1344. 27: Eric Normand on teaching Clojure
    The REPL 2019-07-24T07:00:00+00:00
  1345. Swisstable, a Quick and Dirty Description - Faultlore
    Faultlore 2019-07-27T00:00:00+00:00
  1346. Open URL in existing Qutebrowser from Emacs Daemon on Gentoo
    Winny's Blog 2019-07-28T05:00:00+00:00
  1347. The Danger of fuzzy matching over one's PATH
    Winny's Blog 2019-08-02T11:00:00+00:00
  1348. 👁
    Romain Laurent 2019-08-06T20:32:37+00:00
  1349. 28: Ambrose Bonnaire-Sergeant on Typed Clojure
    The REPL 2019-08-12T05:28:41+00:00
  1350. The iPad needs more focus on the little things
    Article on Coyote Cartography 2019-08-12T15:36:07+00:00
  1351. Happy Birthday Scott Pilgrim vs. The World! 08/13/2010 What if...
    Tribute Games 2019-08-13T16:45:31+00:00
  1352. Electric Geek Transportation Systems
    Coding Horror 2019-08-20T11:35:16+00:00
  1353. Making and Tool-Making
    Citybound Devblog 2019-08-24T00:00:00+00:00
  1354. Tool Making Follow-Up: What I Mean by Friction
    Citybound Devblog 2019-08-25T00:00:00+00:00
  1355. 29: Marc O'Morain on adding Windows support to CircleCI
    The REPL 2019-08-26T17:00:00+00:00
  1356. What Remains Technical Breakdown
    dustmop.io blog 2019-09-10T17:42:32+00:00
  1357. lilo
    The Dictionary of Obscure Sorrows 2019-09-11T20:16:05+00:00
  1358. The Rise of the Electric Scooter
    Coding Horror 2019-09-12T07:24:32+00:00
  1359. Trailer
    Verb Your Enthusiasm 2019-09-23T00:49:51+00:00
  1360. Welcome, Prashanth!
    Joel on Software 2019-09-24T14:00:17+00:00
  1361. Death Trash will enter Steam Early Access soon: We have some...
    DeathTrash 2019-09-25T06:15:35+00:00
  1362. Text Rendering Hates You - Faultlore
    Faultlore 2019-09-28T00:00:00+00:00
  1363. Today, Impeachment; Tomorrow, Riot!
    Not My Empire 2019-09-28T21:12:20+00:00
  1364. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T09:47:47+00:00
  1365. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T09:48:47+00:00
  1366. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T09:49:35+00:00
  1367. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T09:50:11+00:00
  1368. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T09:50:40+00:00
  1369. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T09:58:38+00:00
  1370. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T10:20:38+00:00
  1371. Photo
    HIGHLIGHTER AND SHARPIE PARTY 2019-09-29T10:44:09+00:00
  1372. Blackifying and fixing bugs
    Evennia Devblog RSS Feed 2019-09-30T00:00:00+00:00
  1373. Blackifying and fixing bugs
    Griatch's Evennia musings 2019-09-30T15:39:00+00:00
  1374. Globalisms, Real And Imagined: Hong Kong, Haiti, And The New Internationals
    Not My Empire 2019-10-06T02:26:11+00:00
  1375. Photo
    Terrible Banana 2019-10-06T14:47:05+00:00
  1376. 2019 Episode 1
    Verb Your Enthusiasm 2019-10-10T19:51:18+00:00
  1377. Big Mouth, Little Fascisms
    Not My Empire 2019-10-11T22:05:11+00:00
  1378. 2019 Episode 2
    Verb Your Enthusiasm 2019-10-20T04:04:21+00:00
  1379. Trump Is No Isolationist, He Just Hates Democracy
    Not My Empire 2019-10-22T01:16:20+00:00
  1380. 30: Bobby Calderwood on Kafka and Fintech
    The REPL 2019-10-22T01:36:33+00:00
  1381. Release 2.5.2
    The Ground Gives Way 2019-10-26T15:19:39+00:00
  1382. A trip down NBA Jam graphics pipeline
    Fabien Sanglard 2019-10-28T00:00:00+00:00
  1383. The Facebook Crisis Is Bigger Than Fact-Checking
    Not My Empire 2019-10-28T23:02:34+00:00
  1384. Track down basic Emacs bugs & hangs
    Winny's Blog 2019-11-01T05:00:00+00:00
  1385. Breaking Bad: The Incomplete History of the St Albans Bible
    medievalbooks 2019-11-01T16:47:30+00:00
  1386. On CTIA v. City of Berkeley
    LESSIG Blog 2019-11-02T12:17:35+00:00
  1387. GDG Milwaukee 2019 DevFest - We participated!
    Winny's Blog 2019-11-06T07:39:00+00:00
  1388. How Swift Achieved Dynamic Linking Where Rust Couldn't - Faultlore
    Faultlore 2019-11-07T00:00:00+00:00
  1389. 2019 Episode 3
    Verb Your Enthusiasm 2019-11-08T02:19:46+00:00
  1390. 31: Joel Holdbrooks on Meander
    The REPL 2019-11-08T03:30:04+00:00
  1391. 2019 Episode 4
    Verb Your Enthusiasm 2019-11-11T15:49:27+00:00
  1392. 32: Clojure, Kafka, and OPERATR with Derek Troy-West
    The REPL 2019-11-13T06:00:00+00:00
  1393. Won MSOE x Google Cloud Hackathon
    Winny's Blog 2019-11-19T01:49:00+00:00
  1394. current projects
    Mighty Vision 2019-11-22T21:41:00+00:00
  1395. current projects
    Mighty Vision 2019-11-22T21:41:00+00:00
  1396. 33: Peter Strömberg on Calva, a Clojure plugin for VS Code
    The REPL 2019-11-23T01:44:06+00:00
  1397. Milwaukee Code Camp
    Winny's Blog 2019-11-24T01:24:00+00:00
  1398. On Making a Pizza Delivery Game: I recently finished up working on a prototype of a small pizza...
    jordan orelli 2019-11-29T15:43:14+00:00
  1399. This blog would have been 10 years old today (it’s still retired). 10 years covering the subject of...
    prosthetic knowledge 2019-12-01T21:18:05+00:00
  1400. Strike Commander: Interview with Frank Savage
    Fabien Sanglard 2019-12-03T00:00:00+00:00
  1401. So, how’s that retirement thing going, anyway?
    Joel on Software 2019-12-05T22:51:39+00:00
  1402. Using bash to monitor devices entering/exiting a LAN
    The Grymoire 2019-12-09T16:41:16+00:00
  1403. Reverse Engineering the DirecTV App’s DVR Authentication
    Neglected Potential 2019-12-19T19:45:14+00:00
  1404. How I'm Implementing Procedural Architecture
    Citybound Devblog 2019-12-21T00:00:00+00:00
  1405. The Research That Goes Into Citybound
    Citybound Devblog 2019-12-22T00:00:00+00:00
  1406. Why I'm moving from Patreon to Github Sponsors
    Citybound Devblog 2019-12-23T00:00:00+00:00
  1407. How to fix early framebuffer problems, or "Can I type my disk password yet??"
    Winny's Blog 2019-12-25T08:37:00+00:00
  1408. Apparently a new line of attack against Medicare for All is that the Medicare reimbursement rate is…
    Squashed 2019-12-26T20:57:17+00:00
  1409. small progress update
    Mighty Vision 2019-12-28T18:04:00+00:00
  1410. small progress update
    Mighty Vision 2019-12-28T18:04:00+00:00
  1411. The Polygons of Another World
    Fabien Sanglard 2020-01-01T00:00:00+00:00
  1412. The Polygons of Another World: Amiga
    Fabien Sanglard 2020-01-02T00:00:00+00:00
  1413. The Polygons of Another World: Atari ST
    Fabien Sanglard 2020-01-03T00:00:00+00:00
  1414. The Polygons of Another World: PC DOS
    Fabien Sanglard 2020-01-04T00:00:00+00:00
  1415. Algorithms interviews: theory vs. practice
    Dan Luu 2020-01-05T00:00:00+00:00
  1416. The Polygons of Another World: Genesis
    Fabien Sanglard 2020-01-05T00:00:00+00:00
  1417. Photo
    Terrible Banana 2020-01-05T21:23:45+00:00
  1418. Photo
    Terrible Banana 2020-01-06T21:52:10+00:00
  1419. There Is No Case For War With Iran
    Not My Empire 2020-01-07T04:12:07+00:00
  1420. Switching website to GitLab Pages
    Winny's Blog 2020-01-07T19:16:00+00:00
  1421. The Polygons of Another World: SNES
    Fabien Sanglard 2020-01-19T00:00:00+00:00
  1422. this guy sucks
    Terrible Banana 2020-01-20T23:46:34+00:00
  1423. NeXTstep on the HP 712 Part 1: Installation
    Pizza Box Computer 2020-01-21T01:20:00+00:00
  1424. The Polygons of Another World: GBA
    Fabien Sanglard 2020-01-26T00:00:00+00:00
  1425. 95%-ile isn't that good
    Dan Luu 2020-02-07T00:00:00+00:00
  1426. Photo
    Terrible Banana 2020-02-08T04:07:21+00:00
  1427. Suspicious discontinuities
    Dan Luu 2020-02-18T00:00:00+00:00
  1428. A Dozen Small Games
    jordan orelli 2020-02-23T18:25:04+00:00
  1429. PANZER PALADIN Coming Soon to Nintendo Switch and Steam! Click...
    Tribute Games 2020-02-25T16:01:23+00:00
  1430. Photo
    garfield minus garfield 2020-03-01T14:24:59+00:00
  1431. On the eve of Super Tuesday...
    Squashed 2020-03-02T22:51:45+00:00
  1432. The growth of command line options, 1979-Present
    Dan Luu 2020-03-03T00:00:00+00:00
  1433. Why All of Bernie's Supporters should vote for Warren Instead Because of Math
    Squashed 2020-03-03T01:34:14+00:00
  1434. The beautiful machine
    Fabien Sanglard 2020-03-06T00:00:00+00:00
  1435. How (some) good corporate engineering blogs are written
    Dan Luu 2020-03-11T00:00:00+00:00
  1436. The Polygons of Another World: Jaguar
    Fabien Sanglard 2020-03-13T00:00:00+00:00
  1437. GTA V – The Wormy Fountain
    Simonschreibt. 2020-03-20T22:10:01+00:00
  1438. 34: CIDER and tending the Orchard with Bozhidar Batsov
    The REPL 2020-03-24T19:00:00+00:00
  1439. The Polygons of DOOM: PSX
    Fabien Sanglard 2020-03-26T00:00:00+00:00
  1440. Ideas for Upcoming Livestreams (Pedestrians & Epidemics)
    Citybound Devblog 2020-03-28T00:00:00+00:00
  1441. Extending a wireless LAN with a bridged Ethernet LAN using Mikrotik RouterOS
    Winny's Blog 2020-03-29T17:23:00+00:00
  1442. 35: Mature Clojure codebases with Łukasz Korecki
    The REPL 2020-04-01T23:17:03+00:00
  1443. Crafting “Crafting Interpreters”
    journal.stuffwithstuff.com 2020-04-05T07:00:00+00:00
  1444. Spring updates while trying to stay healthy
    Evennia Devblog RSS Feed 2020-04-14T00:00:00+00:00
  1445. Newsletter #9 - Three's company
    Neovim 2020-04-14T00:00:00+00:00
  1446. Spring updates while trying to stay healthy
    Griatch's Evennia musings 2020-04-14T16:31:00+00:00
  1447. Auto-Injecting Files into an Active PCem/86Box Machine
    int10h.org - VileR's blog 2020-04-15T09:19:00+00:00
  1448. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2020-04-17T13:14:23+00:00
  1449. A week in the life of Winston
    Winny's Blog 2020-04-19T00:27:00+00:00
  1450. Building a PC, Part IX: Downsizing
    Coding Horror 2020-04-19T23:56:03+00:00
  1451. The Making Of Stunt Island
    Fabien Sanglard 2020-04-21T00:00:00+00:00
  1452. Debugging Zathura, GTK (don't forget about seccomp)
    Winny's Blog 2020-04-25T03:49:00+00:00
  1453. 36: Clojure CLI tools with Michiel Borkent
    The REPL 2020-04-25T21:15:43+00:00
  1454. Revisiting the Businesscard Raytracer
    Fabien Sanglard 2020-05-01T00:00:00+00:00
  1455. Linux dmesg --follow (-w) not working?
    Winny's Blog 2020-05-01T02:04:00+00:00
  1456. An history of NVidia Stream Multiprocessor
    Fabien Sanglard 2020-05-02T00:00:00+00:00
  1457. preview for Imbroglio: Mizzenmast
    Mighty Vision 2020-05-02T15:37:00+00:00
  1458. preview for Imbroglio: Mizzenmast
    Mighty Vision 2020-05-02T15:37:00+00:00
  1459. imbroglio - expansion & crash
    Mighty Vision 2020-05-05T16:58:00+00:00
  1460. imbroglio - expansion & crash
    Mighty Vision 2020-05-05T16:58:00+00:00
  1461. Memories of Working on Homestuck - Faultlore
    Faultlore 2020-05-06T00:00:00+00:00
  1462. 0x10 rules
    Fabien Sanglard 2020-05-07T00:00:00+00:00
  1463. definite plan
    Mighty Vision 2020-05-10T11:35:00+00:00
  1464. definite plan
    Mighty Vision 2020-05-10T11:35:00+00:00
  1465. animenostalgia: Gunnm (aka Battle Angel Alita) by Yukito Kishiro
    ONO-SENDAI CYBERSPACE 7 2020-05-10T16:10:58+00:00
  1466. pinkbubblegum3: Katsuya Terada ♥
    ONO-SENDAI CYBERSPACE 7 2020-05-10T16:11:12+00:00
  1467. thevideogameartarchive: Artwork from ‘ESWAT’ on the Sega...
    ONO-SENDAI CYBERSPACE 7 2020-05-10T16:11:37+00:00
  1468. animarchive: Japanese artist and sculptor Kow Yokoyama -...
    ONO-SENDAI CYBERSPACE 7 2020-05-10T16:11:45+00:00
  1469. curatorofthisdigitalmorass: SYD MEAD
    ONO-SENDAI CYBERSPACE 7 2020-05-10T16:11:51+00:00
  1470. rocketumbl: rocketumbl: Fireball SG Ma.K., built by Ama
    ONO-SENDAI CYBERSPACE 7 2020-05-10T16:14:33+00:00
  1471. ⚔️ The Panzer Paladin Gameplay Trailer is here! ⚔️ ✅ Click here...
    Tribute Games 2020-05-15T14:01:43+00:00
  1472. Revisiting the postcard pathtracer
    Fabien Sanglard 2020-05-18T00:00:00+00:00
  1473. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2020-05-18T22:03:26+00:00
  1474. 37: The Clojurists Together Foundation with lvh
    The REPL 2020-05-21T09:13:00+00:00
  1475. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2020-05-23T14:57:13+00:00
  1476. A tale of Ghosts'n Goblins'n Crocos
    Fabien Sanglard 2020-05-30T00:00:00+00:00
  1477. A simple way to get more value from metrics
    Dan Luu 2020-05-30T07:06:34+00:00
  1478. A simple way to get more value from tracing
    Dan Luu 2020-05-31T07:06:34+00:00
  1479. Passing runtime data to AWK
    Arabesque 2020-05-31T11:55:54+00:00
  1480. Straight Out Of Furlough
    GAMEPOPPER 2020-05-31T15:07:51+00:00
  1481. Finding the Story
    Dan Luu 2020-06-02T07:05:34+00:00
  1482. I want to make sure that nobody is missing what Trump is doing he calls out for “Law and Order” and…
    Squashed 2020-06-03T00:34:10+00:00
  1483. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2020-06-05T17:14:07+00:00
  1484. Discret 11, the French TV encryption of the 80's
    Fabien Sanglard 2020-06-07T00:00:00+00:00
  1485. NeXTstep on the HP 712 Part 2: Getting Software
    Pizza Box Computer 2020-06-09T13:45:00+00:00
  1486. Life Harvester #18: Frances Beal, Sylvia Rivera, Walter Benjamin
    Life Harvester 2020-06-15T16:33:31+00:00
  1487. HASH: a free, online platform for modeling the world
    Joel on Software 2020-06-18T14:12:25+00:00
  1488. agnosthesia
    The Dictionary of Obscure Sorrows 2020-06-20T19:11:05+00:00
  1489. Alyse Galvin on Coronavirus in Alaska
    Idle Words 2020-06-24T13:30:00+00:00
  1490. How do cars do in out-of-sample crash testing?
    Dan Luu 2020-06-30T07:06:34+00:00
  1491. Ultimate Oldschool PC Font Pack v2.0 Released
    int10h.org - VileR's blog 2020-07-13T09:25:18+00:00
  1492. Life Harvester #19: Black, Young, & Educated
    Life Harvester 2020-07-15T10:34:00+00:00
  1493. The Gods Pocket Peak Trail
    Embedded in Academia 2020-07-23T15:41:01+00:00
  1494. Alive2 Part 3: Things You Can and Can’t Do with Undef in LLVM
    Embedded in Academia 2020-07-31T20:33:05+00:00
  1495. wrenmcdonald: Ex.Mag FULL METAL DREAMLAND, the genre-based...
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:19:30+00:00
  1496. Dragon’s Heaven - Makoto Kobayashi, Toshihiro Hirano
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:19:37+00:00
  1497. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:19:49+00:00
  1498. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:20:12+00:00
  1499. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:20:19+00:00
  1500. ultrakillblast: TRON (1982)
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:20:29+00:00
  1501. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:21:42+00:00
  1502. wrenmcdonald: Ex.Mag 01 back cover 💚
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:22:18+00:00
  1503. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:22:48+00:00
  1504. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:23:12+00:00
  1505. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:23:33+00:00
  1506. yodawgiheardyoulikemecha: Space dude
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:24:02+00:00
  1507. ravenkult: Fullfillment Center by Brian Sum...
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:24:11+00:00
  1508. Photo
    ONO-SENDAI CYBERSPACE 7 2020-08-02T13:24:41+00:00
  1509. ZZT Stories: The Reconstruction
    Posts on asie's blog 2020-08-04T20:30:00+00:00
  1510. ⚔️ Panzer Paladin: OUT NOW! ⚔️ ✅ Get it on Steam! ✅🎮 Get it on...
    Tribute Games 2020-08-07T15:05:56+00:00
  1511. Top 5 indie games of July 2020: The IND13 Picks
    Tribute Games 2020-08-07T15:07:13+00:00
  1512. Sara Huddleston on the Latino Vote in Iowa
    Idle Words 2020-08-08T00:40:00+00:00
  1513. Responsible and Effective Bugfinding
    Embedded in Academia 2020-08-17T18:36:43+00:00
  1514. Life Harvester #20: Prison Abolition
    Life Harvester 2020-08-18T16:33:34+00:00
  1515. Okay how did I not understand this situation until Steve Bannon...
    Squashed 2020-08-20T16:47:27+00:00
  1516. When NAT Bites — Use a Reverse VPN
    Winny's Blog 2020-08-31T05:00:00+00:00
  1517. Effective Political Giving
    Idle Words 2020-09-03T04:05:00+00:00
  1518. tinycartridge: Panzer Paladin is better than pretty much any...
    Tribute Games 2020-09-09T17:06:01+00:00
  1519. Life Harvester #21: T'Shuva, Electric Shavers, Sleep Headphones, Granola
    Life Harvester 2020-09-15T13:02:06+00:00
  1520. Git Push
    int10h.org - VileR's blog 2020-09-16T14:21:09+00:00
  1521. azspot: “The threat of increasing the size of the court to 13 might be enough to discourage...
    Squashed 2020-09-22T12:59:47+00:00
  1522. Recent Trends in Wealth-Holding by Race and Ethnicity: Evidence from the Survey of Consumer Finances
    Squashed 2020-09-24T18:55:42+00:00
  1523. Systemic racism really isn’t complicated or controversial
    Squashed 2020-09-24T21:06:42+00:00
  1524. Another analogy on education about white supremacy
    Squashed 2020-09-27T17:24:52+00:00
  1525. On Trump’s Taxes
    Squashed 2020-09-28T15:26:54+00:00
  1526. A personal example of systemic racism
    Squashed 2020-09-28T16:58:55+00:00
  1527. "This sleazy Supreme Court double-dealing is the last gasp of a corrupt Republican leadership, numb..."
    Squashed 2020-09-28T18:30:52+00:00
  1528. On Colorblindness
    Squashed 2020-10-01T22:29:10+00:00
  1529. Switching to Lenovo Carbon X1
    Fabien Sanglard 2020-10-02T00:00:00+00:00
  1530. 2020 Episode 1
    Verb Your Enthusiasm 2020-10-03T22:49:06+00:00
  1531. Protests and Power
    Idle Words 2020-10-04T22:17:00+00:00
  1532. 2020 Episode 2
    Verb Your Enthusiasm 2020-10-11T08:42:13+00:00
  1533. WHEN 13.3 > 14
    Fabien Sanglard 2020-10-12T00:00:00+00:00
  1534. Life Harvester #22: The Ramones, H2O, Less Than Jake, 25 Ta Life, Blanks 77, US Bombs, Social Distortion
    Life Harvester 2020-10-15T16:25:48+00:00
  1535. Twist Turn Shoot Burn: A Postmortem
    GAMEPOPPER 2020-10-19T17:16:17+00:00
  1536. On using Markdown with Sphinx - onward to Evennia 0.9.5
    Griatch's Evennia musings 2020-10-19T22:21:00+00:00
  1537. On using Markdown with Sphinx
    Evennia Devblog RSS Feed 2020-10-20T00:00:00+00:00
  1538. Ultima: Through Farthest Lands and Deepest Dungeons
    CRPG Adventures 2020-10-27T08:18:00+00:00
  1539. Newsletter #10 - Neovim v0.4.4
    Neovim 2020-10-28T00:00:00+00:00
  1540. Game Engine Black Book: Wolfenstein 3D, Korean Edition
    Fabien Sanglard 2020-10-30T00:00:00+00:00
  1541. Neon Signs Banana Neon Light Sign Real Glass Neon Sign Neon Lights Neon Wall Sign Real Neon Decorative Light for Home Bedroom Room Decor Bar Office Halloween Party - - Amazon.com
    Terrible Banana 2020-10-31T03:03:32+00:00
  1542. Nearly 60% of registered voters in North Carolina have voted
    Squashed 2020-10-31T17:30:53+00:00
  1543. Ultima: Victory!
    CRPG Adventures 2020-11-03T10:29:00+00:00
  1544. About my keyboard choices
    Winny's Blog 2020-11-04T02:37:00+00:00
  1545. Game 50: Kadath (1979)
    CRPG Adventures 2020-11-08T15:23:00+00:00
  1546. Mafia II – Hat vs. Hair
    Simonschreibt. 2020-11-09T17:05:01+00:00
  1547. Full motion video in ZZT: State of the art
    Posts on asie's blog 2020-11-09T19:51:00+00:00
  1548. These are called opportunities
    Fabien Sanglard 2020-11-12T00:00:00+00:00
  1549. Wenyan-lang
    esoteric.codes 2020-11-12T07:04:00+00:00
  1550. Evennia 0.9.5 released
    Evennia Devblog RSS Feed 2020-11-14T00:00:00+00:00
  1551. Evennia 0.9.5 released!
    Griatch's Evennia musings 2020-11-14T17:46:00+00:00
  1552. Game 51: Local Call for Death (1979)
    CRPG Adventures 2020-11-15T19:01:00+00:00
  1553. Life Harvester #23: Bëëf Stew, Spooky Movies & TV Shows
    Life Harvester 2020-11-16T19:11:04+00:00
  1554. Classical Chinese as a Programming Language
    esoteric.codes 2020-11-23T07:05:00+00:00
  1555. Computing with JS's undefined
    esoteric.codes 2020-11-23T13:12:00+00:00
  1556. 2020 Episode 3
    Verb Your Enthusiasm 2020-11-29T17:44:11+00:00
  1557. Oak
    esoteric.codes 2020-12-01T06:23:00+00:00
  1558. More Font Updates: Oldschool PC Pack, Flexi IBM VGA
    int10h.org - VileR's blog 2020-12-01T20:23:23+00:00
  1559. Recovering data from a corrupted USB thumbdrive using ddrescue
    The Grymoire 2020-12-01T20:58:50+00:00
  1560. Turing Paint
    esoteric.codes 2020-12-14T06:47:00+00:00
  1561. Life Harvester #24: Leaf Piles, Throwing Blueberries At Yogurt, Ask a Shmuck, Miss D's Movie Madness
    Life Harvester 2020-12-14T12:07:02+00:00
  1562. The beautiful silent thunderbolt-3 PC
    Fabien Sanglard 2020-12-22T00:00:00+00:00
  1563. Against essential and accidental complexity
    Dan Luu 2020-12-29T00:00:00+00:00
  1564. Life Harvester #🤷🏻‍♀️: Your Favorite Thing
    Life Harvester 2020-12-30T19:13:30+00:00
  1565. Happy New Years 2021!
    Evennia Devblog RSS Feed 2021-01-01T00:00:00+00:00
  1566. Happy new years 2021! Evennia things to come this year
    Griatch's Evennia musings 2021-01-01T12:38:00+00:00
  1567. xchg rax, rax
    esoteric.codes 2021-01-05T06:17:00+00:00
  1568. untitled
    Squashed 2021-01-08T02:20:10+00:00
  1569. The confusing world of USB
    Fabien Sanglard 2021-01-10T00:00:00+00:00
  1570. About my Medium posts
    LESSIG Blog 2021-01-10T14:29:42+00:00
  1571. MEDIUM: Ted Cruz and Josh Hawley’s illegal objection
    LESSIG Blog 2021-01-10T14:36:24+00:00
  1572. Simulating CRT Monitors with FFmpeg (Pt. 1: Color CRTs)
    int10h.org - VileR's blog 2021-01-10T16:29:21+00:00
  1573. Autopoiesis
    EXO 2021-01-11T05:34:58+00:00
  1574. aftersome
    The Dictionary of Obscure Sorrows 2021-01-11T21:20:27+00:00
  1575. How to Escape the Confines of Time and Space According to the CIA
    EXO 2021-01-13T17:16:06+00:00
  1576. The Geomagnetic Field and Us
    EXO 2021-01-14T15:15:07+00:00
  1577. MEDIUM: Why Senator Hawley’s latest defense is just more offense.
    LESSIG Blog 2021-01-15T18:44:02+00:00
  1578. Stretching The Electric Diamond
    EXO 2021-01-18T19:30:43+00:00
  1579. KFC Mascot Col. Sanders Talks Malbolge Programming on General Hospital—Wait, What?
    esoteric.codes 2021-01-19T04:56:00+00:00
  1580. Testers wanted!
    DeathTrash 2021-01-19T16:27:50+00:00
  1581. Life Harvester #25: Lazy Magnet, It Did Happen Here, Pedrodamus 2021 Trend Forecast
    Life Harvester 2021-01-22T13:28:09+00:00
  1582. DONE: Final words on the Cruz and Hawley outrage
    LESSIG Blog 2021-01-22T16:38:09+00:00
  1583. Oral Argument in PATRICK v. Alaska
    LESSIG Blog 2021-01-22T16:39:22+00:00
  1584. Hey journalists, here’s the question you need to be asking the insurrectionists.
    LESSIG Blog 2021-01-25T16:43:57+00:00
  1585. Simulating CRT Monitors with FFmpeg (Pt. 2: Monochrome CRTs)
    int10h.org - VileR's blog 2021-02-03T21:43:17+00:00
  1586. Interview with 100 Rabbits
    esoteric.codes 2021-02-04T05:10:00+00:00
  1587. A Global Kind of Mood
    EXO 2021-02-05T00:11:44+00:00
  1588. You want sudo -i or su -
    Winny's Blog 2021-02-14T22:21:00+00:00
  1589. Life Harvester #26: Milford Graves, Best Friends, The Big Bagel Question, Miss Soup Pussy
    Life Harvester 2021-02-18T12:35:04+00:00
  1590. Back in Development
    The Ground Gives Way 2021-03-01T16:06:58+00:00
  1591. Interview with David Madore
    esoteric.codes 2021-03-02T06:36:00+00:00
  1592. Trunk Updates 2 March 2021
    Dungeon Crawl Stone Soup 2021-03-02T12:41:16+00:00
  1593. Simulating NON-CRT Monitors with FFmpeg: Flat Panel Displays
    int10h.org - VileR's blog 2021-03-02T22:09:44+00:00
  1594. 🌱⚡The Plant with a Pulse ⚡🌱
    EXO 2021-03-03T01:54:27+00:00
  1595. Priority Adventure 3: Mission: Asteroid (1980)
    CRPG Adventures 2021-03-11T11:28:00+00:00
  1596. ❤️⚓ Flinthook Concept Art: Mr.Blort is a tired fellow. You...
    Tribute Games 2021-03-14T16:01:02+00:00
  1597. Some tips when copying, recovering disks
    Winny's Blog 2021-03-15T00:53:00+00:00
  1598. What Programming Language Would Yoko Ono Create?
    esoteric.codes 2021-03-16T06:44:00+00:00
  1599. Life Harvester 27: I Killed Kurt Cobain, This Shirt Sucks, Some Records I Love, Dykes To Watch Out For
    Life Harvester 2021-03-16T12:01:40+00:00
  1600. Writing Small CLI Programs in Common Lisp
    Steve Losh 2021-03-17T16:10:00+00:00
  1601. Rogue of the Seven Seas – 7DRL Postmortem
    GAMEPOPPER 2021-03-17T18:00:00+00:00
  1602. Where do I begin?
    Evennia Devblog RSS Feed 2021-03-21T00:00:00+00:00
  1603. Where do I begin? (repost)
    Griatch's Evennia musings 2021-03-21T11:59:00+00:00
  1604. Snarf YouTube videos off gather.town
    Winny's Blog 2021-03-22T02:41:40+00:00
  1605. Game 52: Eamon Scenario 2 - The Lair of the Minotaur (1979)
    CRPG Adventures 2021-03-22T12:34:00+00:00
  1606. Trunk Updates 23 March 2021
    Dungeon Crawl Stone Soup 2021-03-23T10:05:04+00:00
  1607. Interview with Jon Corbett
    esoteric.codes 2021-03-30T06:32:00+00:00
  1608. ringlorn
    The Dictionary of Obscure Sorrows 2021-04-02T14:23:23+00:00
  1609. The Matt Gaetz Saga
    Squashed 2021-04-02T14:34:47+00:00
  1610. Trunk Updates 2 April 2021
    Dungeon Crawl Stone Soup 2021-04-02T17:45:09+00:00
  1611. Moving blog to ox-hugo
    Winny's Blog 2021-04-03T06:46:00+00:00
  1612. Game 53: Maces & Magic - Balrog Sampler (1979)
    CRPG Adventures 2021-04-03T10:38:00+00:00
  1613. Game Engine Black Book: DOOM, Korean Edition
    Fabien Sanglard 2021-04-05T00:00:00+00:00
  1614. 'The Gateway' NFT
    EXO 2021-04-07T21:30:18+00:00
  1615. Safe CRT Monitor Shipping: IBM 5153 Makes it through DHL!
    int10h.org - VileR's blog 2021-04-08T11:44:26+00:00
  1616. Found: Page 25 of the CIA’s Gateway Report on Astral Projection
    EXO 2021-04-08T13:00:37+00:00
  1617. Life Harvester 28: RIP Dan Klein, A Character Sketch From A Novel I'm Writing
    Life Harvester 2021-04-15T16:29:51+00:00
  1618. Trunk Updates 19 April 2021
    Dungeon Crawl Stone Soup 2021-04-19T16:32:56+00:00
  1619. Stripe and Solid-State Economics
    The Diff 2021-05-07T13:56:33+00:00
  1620. Balrog Sampler: Near Victory
    CRPG Adventures 2021-05-11T16:06:00+00:00
  1621. Photo
    Terrible Banana 2021-05-11T16:45:32+00:00
  1622. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2021-05-12T22:03:57+00:00
  1623. llvm-reduce
    Embedded in Academia 2021-05-13T16:58:00+00:00
  1624. Lockdown: Day 3,689.
    garfield minus garfield 2021-05-14T10:53:53+00:00
  1625. Life Harvester 29: Remix Requests, Hate Your Friends Vs Dream Baby Dream, The Last Time I Did Acid, Writing Letters
    Life Harvester 2021-05-14T12:24:19+00:00
  1626. Observing my cellphone switch towers
    Fabien Sanglard 2021-05-15T00:00:00+00:00
  1627. The Harmonic Grid
    EXO 2021-05-16T15:00:32+00:00
  1628. Buddy Roemer, RIP
    LESSIG Blog 2021-05-18T12:59:00+00:00
  1629. Priority CRPG 4: Wizardry: Proving Grounds of the Mad Overlord (1981)
    CRPG Adventures 2021-05-26T16:07:00+00:00
  1630. Freenode is dead, long live Freenode!
    Winny's Blog 2021-05-27T04:13:05+00:00
  1631. Finished all of netflix.
    garfield minus garfield 2021-05-31T20:28:33+00:00
  1632. Kinda a big announcement
    Joel on Software 2021-06-02T16:36:19+00:00
  1633. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2021-06-06T21:55:57+00:00
  1634. The Apple Compact Unwinding Format: Documented and Explained - Faultlore
    Faultlore 2021-06-09T00:00:00+00:00
  1635. Wizardry: Level One
    CRPG Adventures 2021-06-14T13:07:00+00:00
  1636. Mind Set (2) An interpretation of the delicate balance of...
    Romain Laurent 2021-06-14T19:54:47+00:00
  1637. IRC presence moved to libera.chat
    Dungeon Crawl Stone Soup 2021-06-15T16:06:30+00:00
  1638. Interview with Zzo38
    esoteric.codes 2021-06-21T07:39:00+00:00
  1639. Plugging My Newest Blog
    CRPG Adventures 2021-06-22T05:59:00+00:00
  1640. Life Harvester 30: Elon Musk Meth Conspiracy, Ask A Shmuck, Marc & Olivia's Bidet, Free Palestine, Texan Summer Jams
    Life Harvester 2021-06-25T11:17:00+00:00
  1641. Trunk Updates 25 June 2021
    Dungeon Crawl Stone Soup 2021-06-25T19:16:31+00:00
  1642. Wizardry: Level Two
    CRPG Adventures 2021-06-27T07:01:00+00:00
  1643. )Comfort Zone(
    Romain Laurent 2021-06-27T22:44:19+00:00
  1644. Meet Chuck Easttom
    EXO 2021-06-29T01:35:13+00:00
  1645. Enjoying the view
    Romain Laurent 2021-06-30T01:16:17+00:00
  1646. The EXO Guide to Steganography
    EXO 2021-06-30T16:51:24+00:00
  1647. Trunk Updates 1 July 2021 and Tournament Date Set
    Dungeon Crawl Stone Soup 2021-07-01T21:43:20+00:00
  1648. Wizardry: Level Three
    CRPG Adventures 2021-07-04T08:54:00+00:00
  1649. No need to reinstall your OS
    Winny's Blog 2021-07-09T02:01:42+00:00
  1650. 2020 Prize Episode: Vain Empires
    Verb Your Enthusiasm 2021-07-09T15:57:48+00:00
  1651. Neovim News #11 - The Christmas Issue
    Neovim 2021-07-12T00:00:00+00:00
  1652. Wizardry: Grinding Interlude
    CRPG Adventures 2021-07-12T13:24:00+00:00
  1653. A moment of self reflection
    Romain Laurent 2021-07-12T20:16:46+00:00
  1654. Fat Dactyls
    esoteric.codes 2021-07-13T10:18:00+00:00
  1655. Life Harvester 31: First Ever Poetry Issue feat. Mya Spalter, Thera Webb, Ana Armengod, David Morse
    Life Harvester 2021-07-20T14:06:58+00:00
  1656. A monorepo misconception - atomic cross-project commits
    Juho Snellman's Weblog 2021-07-21T11:00:00+00:00
  1657. Wizardry: Level Four
    CRPG Adventures 2021-07-21T17:30:00+00:00
  1658. In the Movie “Das letzte Land” (D 2019, Dir.: Marcel...
    Source Code in TV and Films 2021-07-22T08:19:48+00:00
  1659. From Guns Akimbo (2019), some generic ffmpeg wrapper Go code,...
    Source Code in TV and Films 2021-07-22T08:19:54+00:00
  1660. From WWE’s Money In The Bank PPV - the “Smackdown...
    Source Code in TV and Films 2021-07-22T08:20:08+00:00
  1661. In Czech movie Vysoká hra there is “high-end" police...
    Source Code in TV and Films 2021-07-22T08:24:17+00:00
  1662. American Gods Season 2
    Source Code in TV and Films 2021-07-22T08:24:25+00:00
  1663. Watching #UploadOnPrime and it’s 2033 and they are in a file...
    Source Code in TV and Films 2021-07-22T08:24:33+00:00
  1664. Futurama Season 1 Episode 9 Basic code to go hell.
    Source Code in TV and Films 2021-07-22T08:24:39+00:00
  1665. A screenshot from the recent UK broadcast of the “Why We...
    Source Code in TV and Films 2021-07-22T08:24:50+00:00
  1666. Photo
    Terrible Banana 2021-07-24T01:06:34+00:00
  1667. July 24 Trunk Update Post and 0.27 Tournament Page
    Dungeon Crawl Stone Soup 2021-07-24T07:37:39+00:00
  1668. Wizardry: Level Five
    CRPG Adventures 2021-07-26T09:33:00+00:00
  1669. 640 Pages in 15 Months
    journal.stuffwithstuff.com 2021-07-29T07:00:00+00:00
  1670. 0.27 “The Cursed Flame”
    Dungeon Crawl Stone Soup 2021-07-30T07:06:42+00:00
  1671. Wizardry: Levels Six to Eight
    CRPG Adventures 2021-08-01T19:19:00+00:00
  1672. Beams of consciousness
    Romain Laurent 2021-08-03T00:15:09+00:00
  1673. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2021-08-04T21:09:52+00:00
  1674. The Dictionary of Obscure Sorrows—the book. 12 years in the...
    The Dictionary of Obscure Sorrows 2021-08-08T12:42:50+00:00
  1675. Wizardry: Level 9 (a tale of two disasters)
    CRPG Adventures 2021-08-09T08:32:00+00:00
  1676. Using an old Supermicro IPMI to configure broken networking
    Winny's Blog 2021-08-10T01:05:00+00:00
  1677. Life Harvester 32: Actual Freaks, Munchin On Crunchy Cukes, Delta Variant, A Picture Of A Poster
    Life Harvester 2021-08-17T11:25:11+00:00
  1678. Coding in Indigenous African Languages
    esoteric.codes 2021-08-18T09:20:00+00:00
  1679. Screenshot from the Departure season 2 episode 1 showing part of...
    Source Code in TV and Films 2021-08-20T07:36:52+00:00
  1680. 0.27 Tournament Results
    Dungeon Crawl Stone Soup 2021-08-20T10:31:29+00:00
  1681. 0.27.1 Bugfix Release
    Dungeon Crawl Stone Soup 2021-08-21T18:45:50+00:00
  1682. Measurement, benchmarking, and data analysis are underrated
    Dan Luu 2021-08-27T00:00:00+00:00
  1683. Trunk Updates 29 August 2021
    Dungeon Crawl Stone Soup 2021-08-29T07:41:14+00:00
  1684. Monkey: the satirical Go package used unwittingly by Arduino and SalesForce
    esoteric.codes 2021-08-30T10:33:00+00:00
  1685. 38: Banking and Clojure with Allen Rohner
    The REPL 2021-08-31T03:30:00+00:00
  1686. Wizardry: Level 10
    CRPG Adventures 2021-09-05T07:45:00+00:00
  1687. 39: Clojure Goes Fast with Alexander Yakushev
    The REPL 2021-09-06T20:00:00+00:00
  1688. Slight magic rework
    The Ground Gives Way 2021-09-08T13:48:45+00:00
  1689. austice
    The Dictionary of Obscure Sorrows 2021-09-09T20:13:53+00:00
  1690. 40: Shipping Clojure code with Paulus Esterhazy
    The REPL 2021-09-13T20:00:00+00:00
  1691. Photo
    Terrible Banana 2021-09-19T15:17:59+00:00
  1692. 41: Clojure pre-history with Chris Houser
    The REPL 2021-09-20T19:08:00+00:00
  1693. Life Harvester 33: Body-ody-ody-ody-ody-ody-ody (Weekend Sensation Journal), Kelly's Turnstile Review
    Life Harvester 2021-09-21T17:31:04+00:00
  1694. Escher Circuits: Using Vision to Perform Computation
    esoteric.codes 2021-09-22T07:29:00+00:00
  1695. Trunk Updates 23 September 2021
    Dungeon Crawl Stone Soup 2021-09-23T16:22:45+00:00
  1696. The value of in-house expertise
    Dan Luu 2021-09-29T00:00:00+00:00
  1697. Censer and Aggravation Rework
    The Ground Gives Way 2021-09-29T21:27:21+00:00
  1698. Bitcoin
    codersnotes.com 2021-10-03T07:00:00+00:00
  1699. 42: Faster JSON parsing with Erik Assum
    The REPL 2021-10-07T08:00:00+00:00
  1700. Some reasons to work on productivity and velocity
    Dan Luu 2021-10-15T00:00:00+00:00
  1701. What to learn
    Dan Luu 2021-10-18T00:00:00+00:00
  1702. Willingness to look stupid
    Dan Luu 2021-10-21T00:00:00+00:00
  1703. 🔮 web3 Is In Our Nature II 🌱
    EXO 2021-10-29T01:35:44+00:00
  1704. 🔮 web3 Is In Our Nature I 🌱
    EXO 2021-10-29T14:00:43+00:00
  1705. Unstable Grounds – Ludum Dare and the Future
    GAMEPOPPER 2021-10-30T10:23:00+00:00
  1706. FATAL FRAME / PROJECT ZERO: Maiden of Black Water Table for Cheat Engine
    Ian Murdock 2021-11-03T02:09:41+00:00
  1707. Persona 5 Table for Cheat Engine
    Ian Murdock 2021-11-03T15:57:09+00:00
  1708. Cyberpunk 2077 Table for Cheat Engine
    Ian Murdock 2021-11-03T17:22:45+00:00
  1709. Prison Simulator Table for Cheat Engine
    Ian Murdock 2021-11-04T23:21:01+00:00
  1710. Forza Horizon 5 Trainer
    Ian Murdock 2021-11-05T22:50:44+00:00
  1711. A Close Look at a Spinlock
    Embedded in Academia 2021-11-06T19:57:06+00:00
  1712. Culture matters
    Dan Luu 2021-11-08T00:00:00+00:00
  1713. Let’s Build a Zoo Cheat Engine Table
    Ian Murdock 2021-11-08T22:09:04+00:00
  1714. Jurassic World Evolution 2 Trainer
    Ian Murdock 2021-11-09T23:13:45+00:00
  1715. Pre-order your copy of “The Dictionary of Obscure Sorrows” from Simon & Schuster:…
    The Dictionary of Obscure Sorrows 2021-11-12T02:05:49+00:00
  1716. PRE-ORDER your copy of “The Dictionary of Obscure Sorrows” here, from Simon & Schuster:…
    The Dictionary of Obscure Sorrows 2021-11-13T02:26:40+00:00
  1717. 43: Clojure, The Essential Reference with Renzo Borgatti
    The REPL 2021-11-13T03:10:33+00:00
  1718. Before CurseForge (Microblog)
    Posts on asie's blog 2021-11-14T10:32:00+00:00
  1719. etterath
    The Dictionary of Obscure Sorrows 2021-11-14T20:02:53+00:00
  1720. Individuals matter
    Dan Luu 2021-11-15T00:00:00+00:00
  1721. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2021-11-15T21:37:20+00:00
  1722. Forza Horizon 5 Cheat Engine Table
    Ian Murdock 2021-11-16T13:28:35+00:00
  1723. This project started here on Tumblr more than 10 years ago. To all my followers, I can’t thank you…
    The Dictionary of Obscure Sorrows 2021-11-16T16:44:47+00:00
  1724. Banana Phone Bluetooth Handset for Cell Phones
    Terrible Banana 2021-11-16T17:07:04+00:00
  1725. CONGRATS on the publication of the book!! I’m so glad this is finally happening! I just have one question: how are the contents organized? Is it like a normal dictionary with entries in alphabetical order, or in order of creation like this tumblr, or something else entirely?
    The Dictionary of Obscure Sorrows 2021-11-17T17:06:59+00:00
  1726. The blog moved!
    Evennia Devblog RSS Feed 2021-11-18T00:00:00+00:00
  1727. Forza Horizon 5 Cheat Engine {vinny2k}
    Ian Murdock 2021-11-18T13:20:29+00:00
  1728. watashiato
    The Dictionary of Obscure Sorrows 2021-11-18T20:17:16+00:00
  1729. The Evennia blog has moved to evennia.com!
    Griatch's Evennia musings 2021-11-18T21:53:00+00:00
  1730. Major errors on this blog (and their corrections)
    Dan Luu 2021-11-22T00:00:00+00:00
  1731. I never imagined this was possible, but “The Dictionary of Obscure Sorrows” is now a New York Times…
    The Dictionary of Obscure Sorrows 2021-11-25T04:05:40+00:00
  1732. Migrating from Emacs 26 to Emacs 27 on Gentoo
    Winny's Blog 2021-11-28T06:00:00+00:00
  1733. 'Space Covidders' Goes to the Arcade?
    int10h.org - VileR's blog 2021-11-28T20:37:05+00:00
  1734. Thievery rework/rebalancing
    The Ground Gives Way 2021-12-02T13:21:26+00:00
  1735. Some latency measurement pitfalls
    Dan Luu 2021-12-06T00:00:00+00:00
  1736. Trunk Updates 6 December 2021
    Dungeon Crawl Stone Soup 2021-12-06T20:21:04+00:00
  1737. Halo Infinite Trainer
    Ian Murdock 2021-12-10T20:02:00+00:00
  1738. Stanford Professor Garry Nolan Is Analyzing Anomalous Materials From UFO Crashes
    EXO 2021-12-10T21:40:33+00:00
  1739. Some thoughts on writing
    Dan Luu 2021-12-13T00:00:00+00:00
  1740. Melee Weapon Rebalancing
    The Ground Gives Way 2021-12-17T13:28:52+00:00
  1741. The container throttling problem
    Dan Luu 2021-12-18T00:00:00+00:00
  1742. Trunk Updates 18 December 2021
    Dungeon Crawl Stone Soup 2021-12-18T13:55:26+00:00
  1743. Following Street Fighter 2 paper trails
    Fabien Sanglard 2021-12-22T00:00:00+00:00
  1744. Street Fighter 2: The World Warrier
    Fabien Sanglard 2021-12-23T00:00:00+00:00
  1745. Street Fighter 2: Subtile accurate animation
    Fabien Sanglard 2021-12-24T00:00:00+00:00
  1746. Street Fighter 2: Spin when you can't
    Fabien Sanglard 2021-12-24T00:00:00+00:00
  1747. Trunk Updates 28 December 2021
    Dungeon Crawl Stone Soup 2021-12-28T18:08:10+00:00
  1748. Updating The Single Most Influential Book of the BASIC Era
    Coding Horror 2021-12-31T23:49:00+00:00
  1749. ZZT World Creation Contest '91: Allen Pilgrim and Tom Breton's recollections
    Posts on asie's blog 2022-01-04T19:15:00+00:00
  1750. Into 2022 with thanks and plans
    Evennia Devblog RSS Feed 2022-01-06T00:00:00+00:00
  1751. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2022-01-08T11:42:47+00:00
  1752. 11-Streak by Gambler Justice
    The Ground Gives Way 2022-01-11T12:35:36+00:00
  1753. Street Fighter 2: Sound System Internals
    Fabien Sanglard 2022-01-15T00:00:00+00:00
  1754. Trunk Updates 17 January 2022 and Tournament Announcement
    Dungeon Crawl Stone Soup 2022-01-18T04:27:28+00:00
  1755. TGGW v2.6 is out!
    The Ground Gives Way 2022-01-22T11:24:32+00:00
  1756. Destroy All Values: Designing Deinitialization in Programming Languages - Faultlore
    Faultlore 2022-01-23T00:00:00+00:00
  1757. Compiling GhidraNinja’s Pico Debug’N’Dump
    The Grymoire 2022-01-24T18:03:54+00:00
  1758. Making the web better. With blocks!
    Joel on Software 2022-01-27T17:14:00+00:00
  1759. Set up a Private GitLab Runner on Alpine Linux
    Winny's Blog 2022-01-30T05:00:21+00:00
  1760. 0.28 Tournament Page and Schedule
    Dungeon Crawl Stone Soup 2022-02-01T17:31:36+00:00
  1761. A decade of major cache incidents at Twitter
    Dan Luu 2022-02-02T00:00:00+00:00
  1762. Cocktail party ideas
    Dan Luu 2022-02-02T00:00:00+00:00
  1763. Life Harvester 34: I Still Can't Get Any Writing Done So I Reprinted A Joan Didion Essay Without Permission
    Life Harvester 2022-02-03T18:21:30+00:00
  1764. 0.28 “The Rise and Fall of Ignis Zotdust and the Spiders from Hell”
    Dungeon Crawl Stone Soup 2022-02-03T23:34:42+00:00
  1765. plague
    Mighty Vision 2022-02-07T13:01:00+00:00
  1766. plague
    Mighty Vision 2022-02-07T13:01:00+00:00
  1767. Auto-rip Music CDs
    Winny's Blog 2022-02-07T20:52:27+00:00
  1768. Mind Set
    Romain Laurent 2022-02-09T20:22:27+00:00
  1769. The Factorio Mindset
    The Diff 2022-02-11T14:31:56+00:00
  1770. Stupid Dog
    journal.stuffwithstuff.com 2022-02-13T08:00:00+00:00
  1771. But her emails!
    Squashed 2022-02-19T00:39:11+00:00
  1772. CPS-1: GFX system internals
    Fabien Sanglard 2022-02-20T00:00:00+00:00
  1773. Misidentifying talent
    Dan Luu 2022-02-21T00:00:00+00:00
  1774. 0.28 Tournament Results
    Dungeon Crawl Stone Soup 2022-02-21T22:31:37+00:00
  1775. Give me all the PC Engine ports ⊟
    Tiny Cartridge 3DS 2022-02-25T18:01:03+00:00
  1776. Wizardry: Pyrrhic Victory
    CRPG Adventures 2022-02-27T07:42:00+00:00
  1777. Faultlore: Learning Through Errors - Faultlore
    Faultlore 2022-02-27T18:18:56+00:00
  1778. Great news for your PAC-PASSION ⊟
    Tiny Cartridge 3DS 2022-03-02T23:24:41+00:00
  1779. The 2030 Self-Driving Car Bet
    Coding Horror 2022-03-04T18:53:32+00:00
  1780. Trunk updates 6 March 2022
    Dungeon Crawl Stone Soup 2022-03-06T17:12:55+00:00
  1781. Why is it so hard to buy things that work well?
    Dan Luu 2022-03-14T00:00:00+00:00
  1782. C Isn't A Programming Language Anymore - Faultlore
    Faultlore 2022-03-16T21:24:08+00:00
  1783. Rust's Unsafe Pointer Types Need An Overhaul - Faultlore
    Faultlore 2022-03-19T22:13:17+00:00
  1784. DSTs Are Just Polymorphically Compiled Generics - Faultlore
    Faultlore 2022-03-30T22:13:17+00:00
  1785. Taito Milestones out April 15 ⊟
    Tiny Cartridge 3DS 2022-04-05T18:48:34+00:00
  1786. The Tower of Weakenings: Memory Models For Everyone - Faultlore
    Faultlore 2022-04-05T20:07:14+00:00
  1787. In defense of simple architectures
    Dan Luu 2022-04-06T00:00:00+00:00
  1788. Defaults Affect Inference in Rust: Expressions Instead Of Types - Faultlore
    Faultlore 2022-04-10T23:00:13+00:00
  1789. There appears to be some disagreement regarding whether Ukraine sank the Moskva with missiles or, as…
    Squashed 2022-04-14T22:40:40+00:00
  1790. Gotta Protectors is back and a little weirder on Switch ⊟
    Tiny Cartridge 3DS 2022-04-16T13:06:03+00:00
  1791. What if you… listened to Retronauts this week ⊟
    Tiny Cartridge 3DS 2022-04-21T13:49:26+00:00
  1792. Life Harvester 35: May Her Memory Be A Blessing / זיכרונה לברכה / Zikhrona Livrakha, Poems About Grief
    Life Harvester 2022-04-22T19:16:19+00:00
  1793. Neovim News #12 - What's New In Neovim 0.7
    Neovim 2022-04-26T00:00:00+00:00
  1794. My Taipei Quarantine
    Idle Words 2022-04-26T23:12:00+00:00
  1795. Priority Adventure 4: Strange Odyssey (1979)
    CRPG Adventures 2022-05-01T07:49:00+00:00
  1796. USB Cheat Sheet
    Fabien Sanglard 2022-05-05T00:00:00+00:00
  1797. The Beautiful Diablo 2 Resurrected machine
    Fabien Sanglard 2022-05-08T00:00:00+00:00
  1798. Racket on Digital Ocean App Platform
    Winny's Blog 2022-05-15T18:52:27+00:00
  1799. GDC/ADDON 2022: How (not) to create Textures for VFX
    Simonschreibt. 2022-05-22T14:01:32+00:00
  1800. Priority Adventure 5: Mystery Fun House (1979)
    CRPG Adventures 2022-05-24T11:25:00+00:00
  1801. Are you the absolute maniac who will buy Bob with no Bub ⊟
    Tiny Cartridge 3DS 2022-05-30T16:45:15+00:00
  1802. High-Throughput, Formal-Methods-Assisted Fuzzing for LLVM
    Embedded in Academia 2022-05-31T14:56:41+00:00
  1803. Why Build?
    codersnotes.com 2022-06-03T07:00:00+00:00
  1804. About the PS/2 30-286's Hidden VGA Fonts
    int10h.org - VileR's blog 2022-06-05T18:11:12+00:00
  1805. Formal-Methods-Based Bugfinding for LLVM’s AArch64 Backend
    Embedded in Academia 2022-06-06T14:58:02+00:00
  1806. NixOS Migration
    Winny's Blog 2022-06-08T18:12:00+00:00
  1807. The IBM 5153's True CGA Palette and Color Output
    int10h.org - VileR's blog 2022-06-11T00:32:05+00:00
  1808. A match made in the eShop ⊟
    Tiny Cartridge 3DS 2022-06-16T14:20:05+00:00
  1809. Trunk Updates 19 June 2022
    Dungeon Crawl Stone Soup 2022-06-19T14:19:05+00:00
  1810. Save As: DNA 🧬 Part 1
    EXO 2022-06-23T15:33:03+00:00
  1811. Priority Adventure 6: Pyramid of Doom (1979)
    CRPG Adventures 2022-06-29T12:59:00+00:00
  1812. My Famicase Exhibition opening in LA ⊟
    Tiny Cartridge 3DS 2022-06-30T17:47:08+00:00
  1813. Corporate consolidation is good, actually (in this one weird specific case) ⊟
    Tiny Cartridge 3DS 2022-06-30T19:57:37+00:00
  1814. Tutorial-writing and Attributes galore
    Evennia Devblog RSS Feed 2022-07-05T00:00:00+00:00
  1815. Cool DIY Super Famicom kit turned into cooler mini-TV kit ⊟
    Tiny Cartridge 3DS 2022-07-11T20:20:12+00:00
  1816. Priority Adventure 7: Zork: The Great Underground Empire (1980)
    CRPG Adventures 2022-07-17T15:25:00+00:00
  1817. On harm reduction
    Apperceptive by Sam 2022-07-18T13:46:36+00:00
  1818. Trunk Updates 18 July 2022
    Dungeon Crawl Stone Soup 2022-07-19T03:44:42+00:00
  1819. cafe la siesta -8bit edition!!!- 20th Anniversary...
    ⌘+V 2022-07-20T00:38:00+00:00
  1820. Driving is a social process
    Apperceptive by Sam 2022-07-20T17:01:08+00:00
  1821. Romance of the three Kunios today ⊟
    Tiny Cartridge 3DS 2022-07-21T15:11:49+00:00
  1822. The Nightmare Scenario
    Apperceptive by Sam 2022-07-25T15:24:00+00:00
  1823. What autonomous cars see
    Apperceptive by Sam 2022-07-27T18:53:07+00:00
  1824. #lang tinybasic
    Winny's Blog 2022-07-28T02:20:43+00:00
  1825. Understanding Jane Street
    The Diff 2022-08-01T12:39:05+00:00
  1826. The urbanist case for autonomous cars
    Apperceptive by Sam 2022-08-01T19:46:48+00:00
  1827. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2022-08-02T19:17:38+00:00
  1828. Computer vision is badly defined
    Apperceptive by Sam 2022-08-04T12:42:17+00:00
  1829. What if we were actually trying to use technology to make cars safe?
    Apperceptive by Sam 2022-08-08T20:28:44+00:00
  1830. Simon’s Tech Art Learning Materials
    Simonschreibt. 2022-08-13T18:12:26+00:00
  1831. Priority Adventure 8: Ghost Town (1980)
    CRPG Adventures 2022-08-14T13:21:00+00:00
  1832. Trunk Updates 14 August 2022 and Tournament Announcement
    Dungeon Crawl Stone Soup 2022-08-14T21:36:58+00:00
  1833. Publishers old and new bring me games old and old ⊟
    Tiny Cartridge 3DS 2022-08-19T21:23:03+00:00
  1834. Hand-Crafted Artisanal Liquidity Provision
    The Diff 2022-08-22T12:05:04+00:00
  1835. 0.29 “Shooting Stars”
    Dungeon Crawl Stone Soup 2022-08-25T00:42:44+00:00
  1836. There’s a new Puzzle Bobble and it’s very important ⊟
    Tiny Cartridge 3DS 2022-08-26T00:13:34+00:00
  1837. Depending in Common Lisp
    Steve Losh 2022-08-26T15:15:00+00:00
  1838. Save As: DNA 🧬 Part 2
    EXO 2022-08-30T15:01:05+00:00
  1839. Tons of Mirrors ☀️🛰️🪞🌒⚡
    EXO 2022-09-08T19:24:08+00:00
  1840. What is it all for
    Apperceptive by Sam 2022-09-09T19:21:29+00:00
  1841. Futurist prediction methods and accuracy
    Dan Luu 2022-09-12T00:00:00+00:00
  1842. patreon / videos
    Mighty Vision 2022-09-12T22:25:00+00:00
  1843. patreon / videos
    Mighty Vision 2022-09-12T22:25:00+00:00
  1844. Project plans and Splitting a Setting in two
    Evennia Devblog RSS Feed 2022-09-17T00:00:00+00:00
  1845. 44: Jank with Jeaye Wilkerson
    The REPL 2022-09-17T03:57:33+00:00
  1846. I’m a fool for not already being hyped for SpiderHeck ⊟
    Tiny Cartridge 3DS 2022-09-22T15:44:32+00:00
  1847. It's not clear anybody wants autonomous cars
    Apperceptive by Sam 2022-09-23T16:08:31+00:00
  1848. Compiler Optimizations Are Hard Because They Forget - Faultlore
    Faultlore 2022-09-24T06:06:58+00:00
  1849. CCPS: A CPS-1 SDK
    Fabien Sanglard 2022-09-25T00:00:00+00:00
  1850. The Book Of CP-System
    Fabien Sanglard 2022-09-25T00:00:00+00:00
  1851. Large, Static Website hosting with AWS and Let's Encrypt managed with Terraform
    Winny's Blog 2022-09-30T01:03:23+00:00
  1852. Chat log exhibits from Twitter v. Musk case
    Dan Luu 2022-10-01T00:00:00+00:00
  1853. 45: Data Rabbit with Ryan Robitaille
    The REPL 2022-10-03T20:00:00+00:00
  1854. The story Waze tells about problems with autonomous cars
    Apperceptive by Sam 2022-10-07T20:28:39+00:00
  1855. untitled
    garfield minus garfield 2022-10-13T17:16:47+00:00
  1856. Big news for exactly me: new Arkanoid, from Pastagames
    Tiny Cartridge 3DS 2022-10-27T16:36:33+00:00
  1857. Salvation E13S1 some code described as a never seen encryption...
    Source Code in TV and Films 2022-11-05T13:21:15+00:00
  1858. Screenshot from The Boys, Season 3, Episode 8 showing part of...
    Source Code in TV and Films 2022-11-05T13:21:28+00:00
  1859. From “The Silent Sea”, Season 1, Episode 7,...
    Source Code in TV and Films 2022-11-05T13:21:43+00:00
  1860. From Upgrade (2018), python code with messed up indentation
    Source Code in TV and Films 2022-11-05T13:21:57+00:00
  1861. I just wrote this up:...
    Source Code in TV and Films 2022-11-05T13:23:06+00:00
  1862. Gif like a Pro
    Simonschreibt. 2022-11-13T23:46:16+00:00
  1863. Money, Credit, Trust, and FTX
    The Diff 2022-11-14T13:41:25+00:00
  1864. Terminate Software like a Pro
    Simonschreibt. 2022-11-14T14:55:40+00:00
  1865. Using PureRef as Mini-Photoshop
    Simonschreibt. 2022-11-14T15:10:03+00:00
  1866. Simon’s old VFX
    Simonschreibt. 2022-11-14T15:47:51+00:00
  1867. The Book Of CP-System, paper version
    Fabien Sanglard 2022-11-22T00:00:00+00:00
  1868. Castlevania (DOS - Hercules)
    ⌘+V 2022-11-25T10:54:00+00:00
  1869. Leafcutter ants and orchids (rotate)
    ⌘+V 2022-11-25T10:55:13+00:00
  1870. Medieval Manuscript Fragments in the Classroom
    medievalbooks 2022-11-30T20:04:51+00:00
  1871. Mango Passion Fruit
    Romain Laurent 2022-12-01T17:35:14+00:00
  1872. Evennia 1.0 released!
    Evennia Devblog RSS Feed 2022-12-03T00:00:00+00:00
  1873. ulan-bator:
    ⌘+V 2022-12-05T07:48:18+00:00
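  1874. "I used to hold the ideal of connecting the whole world into one, but lately I've come to think that's impossible. Maybe people are happiest living inside a loose filter bubble. -- Why is that...? ダンバ..."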
    ⌘+V 2022-12-08T11:55:22+00:00
  1875. Transcript of Elon Musk on stage with Dave Chapelle
    Dan Luu 2022-12-11T00:00:00+00:00
  1876. microstat
    One Thing Well 2022-12-14T12:30:31+00:00
  1877. Books update
    Fabien Sanglard 2022-12-15T00:00:00+00:00
  1878. shot-scraper
    One Thing Well 2022-12-15T12:30:26+00:00
  1879. A Linux evening...
    Fabien Sanglard 2022-12-16T00:00:00+00:00
  1880. Finicky
    One Thing Well 2022-12-16T12:30:21+00:00
  1881. Bombadillo
    One Thing Well 2022-12-17T12:30:20+00:00
  1882. s
    One Thing Well 2022-12-18T12:30:25+00:00
  1883. SketchyBar
    One Thing Well 2022-12-19T12:00:22+00:00
  1884. Progress on the Block Protocol
    Joel on Software 2022-12-19T13:01:40+00:00
  1885. Happy Net Box
    One Thing Well 2022-12-20T12:00:22+00:00
  1886. Rclone
    One Thing Well 2022-12-21T12:00:22+00:00
  1887. Companion apps for Apple Music
    One Thing Well 2022-12-22T12:00:19+00:00
  1888. podget
    One Thing Well 2022-12-23T12:01:33+00:00
  1889. 46: ClojureDart with Christophe Grand and Baptiste Dupuch
    The REPL 2022-12-23T21:00:00+00:00
  1890. Osmosis S1E2, nanobot programming contains … Singleton...
    Source Code in TV and Films 2022-12-24T15:34:39+00:00
  1891. Newsboat
    One Thing Well 2022-12-29T12:00:19+00:00
  1892. Smol Pub
    One Thing Well 2022-12-30T12:00:24+00:00
  1893. What Neovim shipped in 2022
    Neovim 2022-12-31T00:00:00+00:00
  1894. Why Not Mars
    Idle Words 2023-01-01T23:12:00+00:00
  1895. Type Checking If Expressions
    journal.stuffwithstuff.com 2023-01-03T08:00:00+00:00
  1896. gum
    One Thing Well 2023-01-06T13:05:26+00:00
  1897. New computer checklist
    Winny's Blog 2023-01-09T06:00:00+00:00
  1898. nom
    One Thing Well 2023-01-10T12:00:44+00:00
  1899. Heatwave
    One Thing Well 2023-01-11T12:00:22+00:00
  1900. Trunk Updates 11 Jan 2023
    Dungeon Crawl Stone Soup 2023-01-11T18:46:49+00:00
  1901. 47: Executable textbooks with Sam Ritchie
    The REPL 2023-01-12T14:20:00+00:00
  1902. Cilicon
    One Thing Well 2023-01-12T15:35:33+00:00
  1903. yt-dlp
    One Thing Well 2023-01-13T12:00:33+00:00
  1904. One Thing
    One Thing Well 2023-01-18T12:00:44+00:00
  1905. Phi Chay Thai Cuisine – St. Paul, MN
    You Care What We Think 2023-01-19T16:30:00+00:00
  1906. FrogFind
    One Thing Well 2023-01-20T12:00:30+00:00
  1907. Mjolnir
    Fabien Sanglard 2023-01-23T00:00:00+00:00
  1908. Linky
    One Thing Well 2023-01-25T12:00:27+00:00
  1909. CGTC IV Talks, Day 2
    Combinatorial Game Theory 2023-01-25T18:18:00+00:00
  1910. CGTC IV Talks, Day 3
    Combinatorial Game Theory 2023-01-25T23:28:00+00:00
  1911. Quick Review: Nova Bar - Hudson, WI
    You Care What We Think 2023-01-26T17:00:00+00:00
  1912. Wonder Boy, Bust a Move, New Zealand Story today⊟
    Tiny Cartridge 3DS 2023-02-02T15:59:28+00:00
  1913. Croft Kitchen - Crosby, MN
    You Care What We Think 2023-02-02T16:30:00+00:00
  1914. Can 4GiB meet your needs in 2023?
    Winny's Blog 2023-02-07T00:00:00+00:00
  1915. Life Harvester 36: February 2023 as December 2021, Astrology as Personal Ads
    Life Harvester 2023-02-08T18:36:41+00:00
  1916. Design for democracy - pro bono?
    LESSIG Blog 2023-02-09T13:15:53+00:00
  1917. The Rib Co. - Twentynine Palms, CA
    You Care What We Think 2023-02-09T17:00:00+00:00
  1918. Coming soon
    Lefineder’s Substack 2023-02-12T18:08:38+00:00
  1919. Criminals ViewS Crime
    Lefineder’s Substack 2023-02-13T00:27:25+00:00
  1920. Introducing BootFriend: unofficial custom firmware for WonderSwan Color
    Posts on asie's blog 2023-02-15T23:15:00+00:00
  1921. LLMs, anthropocentric thinking, accuracy, and self-driving
    Apperceptive by Sam 2023-02-16T15:01:42+00:00
  1922. Apostle Supper Club – St. Paul, MN
    You Care What We Think 2023-02-16T17:00:00+00:00
  1923. Don't tase men, bro!
    Lefineder’s Substack 2023-02-19T00:53:52+00:00
  1924. The Unreal Stencil Dragon
    Simonschreibt. 2023-02-20T10:18:02+00:00
  1925. All you may need is HTML
    Fabien Sanglard 2023-03-02T00:00:00+00:00
  1926. AI as UX
    Apperceptive by Sam 2023-03-02T13:26:33+00:00
  1927. Status
    Emily Short's Interactive Storytelling 2023-03-03T00:01:59+00:00
  1928. Pre-commit in GitHub Actions & GitLab CI
    Winny's Blog 2023-03-09T10:00:00+00:00
  1929. Crime rise; more or the same.
    Lefineder’s Substack 2023-03-10T23:05:31+00:00
  1930. Self-serving thought experiments
    Apperceptive by Sam 2023-03-15T12:58:24+00:00
  1931. Old Chips, New Glitches: the CGA/CRTC "Phantom" VSync
    int10h.org - VileR's blog 2023-03-21T07:18:31+00:00
  1932. Patriotism or Prestige
    Lefineder’s Substack 2023-03-23T00:00:30+00:00
  1933. MyHouse.wad
    Terry's Free Game of the Week 2023-03-24T07:37:35+00:00
  1934. 100 win-streak by GJ
    The Ground Gives Way 2023-03-26T14:16:06+00:00
  1935. Regular Home Renovation Simulator
    Terry's Free Game of the Week 2023-03-30T03:51:52+00:00
  1936. The Joy of Computer History Books
    Fabien Sanglard 2023-04-01T00:00:00+00:00
  1937. Trunk Updates 1 April 2023 and Tournament Announcement
    Dungeon Crawl Stone Soup 2023-04-01T05:56:10+00:00
  1938. Sprouts 2023 Talks
    Combinatorial Game Theory 2023-04-02T02:21:00+00:00
  1939. Magpie
    Terry's Free Game of the Week 2023-04-06T13:34:10+00:00
  1940. The Father of Home Video Games
    Ironic Sans 2023-04-11T15:55:49+00:00
  1941. Why Janet?
    Ian Henry 2023-04-12T00:00:00+00:00
  1942. Knowing how to measure
    Apperceptive by Sam 2023-04-14T12:16:09+00:00
  1943. Sylvie wasn’t feeling well today so she went on an adventure to meet different kittens
    Terry's Free Game of the Week 2023-04-14T15:15:32+00:00
  1944. Generalized Macros
    Ian Henry 2023-04-18T00:00:00+00:00
  1945. Joey Wamone’s Normal Bedtime Routine That Is Absolutely Not A Recurring Tooth Decay Nightmare
    Terry's Free Game of the Week 2023-04-20T20:04:42+00:00
  1946. More CGA CRTC Glitching: HD6845(R) vs. MC6845
    int10h.org - VileR's blog 2023-04-22T09:36:31+00:00
  1947. Darth Vader answers the Proust Questionnaire
    Ironic Sans 2023-04-25T15:55:13+00:00
  1948. Battlefront II: Layered Explosion
    Simonschreibt. 2023-04-25T19:39:38+00:00
  1949. TRAUMAKT~4.SEXE
    Terry's Free Game of the Week 2023-04-28T08:51:28+00:00
  1950. The Four Vertex Volume
    Simonschreibt. 2023-04-30T16:36:18+00:00
  1951. Driving Compilers
    Fabien Sanglard 2023-05-03T00:00:00+00:00
  1952. Letters To The Editor
    Fujichia 2023-05-03T14:52:18+00:00
  1953. Combat Mode
    The Ground Gives Way 2023-05-03T17:56:35+00:00
  1954. Japanese Money Simulator
    Terry's Free Game of the Week 2023-05-04T09:05:03+00:00
  1955. Deserve’s Got Nothing To Do With It
    The Popehat Report 2023-05-05T14:54:27+00:00
  1956. 0.30: “The Reavers Return”
    Dungeon Crawl Stone Soup 2023-05-05T16:55:51+00:00
  1957. IREM game collections for me to collect ⊟
    Tiny Cartridge 3DS 2023-05-05T19:05:40+00:00
  1958. Is artificial intelligence as a term per se racist?
    Apperceptive by Sam 2023-05-05T19:37:19+00:00
  1959. The Divine Fire
    Simonschreibt. 2023-05-05T22:34:32+00:00
  1960. Pokémon – Rapidash
    Simonschreibt. 2023-05-07T20:05:37+00:00
  1961. It’s a Media Roundup!
    Ironic Sans 2023-05-09T15:55:09+00:00
  1962. The wait for Gekisou! Benza Race - Toilet Shooting Star is almost over
    Tiny Cartridge 3DS 2023-05-11T13:24:50+00:00
  1963. REALITY_ENDS S1
    Terry's Free Game of the Week 2023-05-11T21:45:58+00:00
  1964. Zaga-33 reborn
    Mighty Vision 2023-05-13T12:25:00+00:00
  1965. Zaga-33 reborn
    Mighty Vision 2023-05-13T12:25:00+00:00
  1966. Green New Deal Simulator
    Terry's Free Game of the Week 2023-05-18T17:52:51+00:00
  1967. Sillypaste migrated to fly.io
    Winny's Blog 2023-05-19T00:30:00+00:00
  1968. 0.30 Tournament Results
    Dungeon Crawl Stone Soup 2023-05-22T04:02:15+00:00
  1969. Jedi: Fallen Order – Splishy Splashy
    Simonschreibt. 2023-05-22T17:18:57+00:00
  1970. I Get No Mail and It’s Glorious
    Ironic Sans 2023-05-23T15:55:49+00:00
  1971. Special Delivery
    Terry's Free Game of the Week 2023-05-24T19:23:33+00:00
  1972. Inequality, galactic and planetary
    Lefineder’s Substack 2023-05-26T00:49:26+00:00
  1973. Deus Ex – Alpha Terrain
    Simonschreibt. 2023-05-28T12:36:14+00:00
  1974. Speech or Cancel Culture At Boston University?
    The Popehat Report 2023-05-31T17:55:05+00:00
  1975. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2023-05-31T21:10:15+00:00
  1976. The Logic of Baseball
    Ironic Sans 2023-06-06T15:55:54+00:00
  1977. My first game jam
    Winny's Blog 2023-06-08T08:30:00+00:00
  1978. I’m coming around to the view that this is, in fact, The Greatest Witch Hunt of All Time. The other…
    Squashed 2023-06-09T18:20:00+00:00
  1979. Evennia 2.0.0 released today
    Evennia Devblog RSS Feed 2023-06-10T00:00:00+00:00
  1980. Jack Smith, Donald Trump, and the Kobayashi Maru
    The Popehat Report 2023-06-10T18:54:28+00:00
  1981. From Danni Storm‘s exhibition W(ord)s & Weavings, which just finished in Copenhagen. Typed on a…
    ⌘+V 2023-06-11T02:31:04+00:00
  1982. GTA V – Underestimated Glow
    Simonschreibt. 2023-06-11T19:47:24+00:00
  1983. That's Not How Recusal Works, That's Not How Any Of This Works!
    The Popehat Report 2023-06-13T18:17:44+00:00
  1984. Anti-Personas
    ignorethecode.net 2023-06-13T21:51:24+00:00
  1985. Thanks @screenrant
    garfield minus garfield 2023-06-14T20:19:09+00:00
  1986. snjmrkm: (via Space Invaders 45th anniversary...
    ⌘+V 2023-06-15T01:59:07+00:00
  1987. Typewriter work by Dirk Krecker, 2014.
    ⌘+V 2023-06-15T01:59:29+00:00
  1988. garadinervi: Heinz Kroehl, Kroehl – Images, Landesmuseum Mainz,...
    ⌘+V 2023-06-15T01:59:57+00:00
  1989. text-mode: By Siggi Eggertsson, 2014.
    ⌘+V 2023-06-15T02:01:23+00:00
  1990. By Mark Webster.
    ⌘+V 2023-06-15T02:01:27+00:00
  1991. By Mark Webster.
    ⌘+V 2023-06-15T02:01:32+00:00
  1992. worldsofzzt: Source “Rotten Robots 2: Revenge of SID” by Caspar...
    ⌘+V 2023-06-15T02:01:49+00:00
  1993. Faun. PETSCII by Electric, 2023.
    ⌘+V 2023-06-15T02:01:53+00:00
  1994. moji: japanese matchbox labels
    ⌘+V 2023-06-15T02:02:15+00:00
  1995. worldsofzzt: Source “Castaway” by Unknown (1996) [CASTAWAY.ZZT]...
    ⌘+V 2023-06-15T02:02:50+00:00
  1996. worldsofzzt: Source “Myst Portal” by Chefchen HK...
    ⌘+V 2023-06-15T02:03:14+00:00
  1997. text-mode: Dirk Krecker
    ⌘+V 2023-06-15T02:03:28+00:00
  1998. Racket frustrates me
    Winny's Blog 2023-06-16T08:30:00+00:00
  1999. Dead Man's Isle - Astoria, OR
    You Care What We Think 2023-06-16T15:50:00+00:00
  2000. Good Vibrations
    Fabien Sanglard 2023-06-17T00:00:00+00:00
  2001. Sinister strike
    Lefineder’s Substack 2023-06-18T19:00:47+00:00
  2002. The Segway Inventor and His Comic Book Father
    Ironic Sans 2023-06-20T15:55:09+00:00
  2003. Cannon Beach Hardware & Public House - Cannon Beach, OR
    You Care What We Think 2023-06-20T16:49:00+00:00
  2004. Streak Redemption
    ignorethecode.net 2023-06-26T20:30:44+00:00
  2005. Supreme Court Clarifies "True Threats" First Amendment Exception
    The Popehat Report 2023-06-27T19:35:46+00:00
  2006. 50 vogels – a project to print 50 birds each with 16×16 LEGO pieces. Made by Roy Scholten and…
    ⌘+V 2023-06-28T07:27:19+00:00
  2007. A Light Melancholy
    Romain Laurent 2023-06-29T21:39:22+00:00
  2008. The Story of The First Software Patent
    Ironic Sans 2023-07-04T15:55:58+00:00
  2009. My Kind of REPL
    Ian Henry 2023-07-05T00:00:00+00:00
  2010. Raising the Bar for IBM PC/XT Emulation: MartyPC
    int10h.org - VileR's blog 2023-07-05T14:37:21+00:00
  2011. > 177: Me claiming I could fix it
    Laura Olin 2023-07-13T08:00:00+00:00
  2012. Tricking Monty Hall
    ignorethecode.net 2023-07-15T11:20:38+00:00
  2013. The Finite Faculties of Man
    Lefineder’s Substack 2023-07-16T16:11:35+00:00
  2014. 10NES
    Fabien Sanglard 2023-07-18T00:00:00+00:00
  2015. 3D Gaming Before VR
    Ironic Sans 2023-07-18T15:55:03+00:00
  2016. When did people stop being drunk all the time?
    Lefineder’s Substack 2023-07-18T19:38:38+00:00
  2017. Carts of Carnage
    Lefineder’s Substack 2023-07-19T19:18:08+00:00
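  2018. "The act of an older brother breaking a piece of gum in two and handing half to his little brother with a 'here you go' is what matters. If it came already split, he couldn't do that."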
    ⌘+V 2023-07-20T04:57:12+00:00
  2019. The Meat of Man the Hunter
    Lefineder’s Substack 2023-07-21T23:59:03+00:00
  2020. Web Environment Integrity vs. Private Access Tokens - They're the same thing!
    Juho Snellman's Weblog 2023-07-25T18:30:00+00:00
  2021. Helpful and unhelpful anthropomorphism
    Apperceptive by Sam 2023-07-26T16:42:13+00:00
  2022. Hunter Biden And The Fog Of War
    The Popehat Report 2023-07-26T21:04:34+00:00
  2023. Commander Keen: Adaptive Tile Scrolling
    Fabien Sanglard 2023-07-27T00:00:00+00:00
  2024. Wonderful Toolchain project update - July 2023
    Posts on asie's blog 2023-07-30T00:00:00+00:00
  2025. The Fibonacci Matrix
    Ian Henry 2023-07-30T00:00:00+00:00
  2026. The value-destroying potential of AI
    Apperceptive by Sam 2023-07-31T14:37:29+00:00
  2027. Rethinking Window Management
    ignorethecode.net 2023-07-31T20:28:26+00:00
  2028. Is Following An Extradition Treaty An Elaborate Political Conspiracy?
    The Popehat Report 2023-08-01T00:11:53+00:00
  2029. New Evidence in 100-Year-Old Claim of Amateurs Accomplishing What Experts Couldn’t
    Ironic Sans 2023-08-01T15:55:54+00:00
  2030. People Are Lying To You About The Trump Indictment
    The Popehat Report 2023-08-02T17:17:00+00:00
  2031. Representing Heterogeneous Data
    journal.stuffwithstuff.com 2023-08-04T07:00:00+00:00
  2032. Nix / NixOS misconceptions
    Winny's Blog 2023-08-06T05:00:00+00:00
  2033. Beware The Flood Of Trump Sentencing Disinformation
    The Popehat Report 2023-08-06T21:31:29+00:00
  2034. The National Review Is Still Lying To You About The Fraud Charge Against Trump
    The Popehat Report 2023-08-08T01:37:37+00:00
  2035. Vim Boss
    Neovim 2023-08-09T00:00:00+00:00
  2036. Understanding (and) psychology
    Apperceptive by Sam 2023-08-09T14:29:28+00:00
  2037. The Weight Of The Unspoken Word
    The Popehat Report 2023-08-10T19:30:06+00:00
  2038. mDNS Primer
    Fabien Sanglard 2023-08-11T00:00:00+00:00
  2039. Ode to the M1
    Fabien Sanglard 2023-08-12T00:00:00+00:00
  2040. More Calories Less Crime
    Lefineder’s Substack 2023-08-12T22:02:32+00:00
  2041. The Magician, The Artist & The Mathematician
    Ironic Sans 2023-08-15T15:55:59+00:00
  2042. Overt Acts and Predicate Acts, Explained
    The Popehat Report 2023-08-17T16:39:42+00:00
  2043. Browsing the web with a WonderSwan in 2023
    Posts on asie's blog 2023-08-19T11:20:00+00:00
  2044. > 178: The footing is ambiguous
    Laura Olin 2023-08-24T13:12:58+00:00
  2045. 2023 Minnesota State Fair - St. Paul, MN
    You Care What We Think 2023-08-25T02:08:00+00:00
  2046. Wonderful Toolchain project update - August 2023
    Posts on asie's blog 2023-08-27T00:00:00+00:00
  2047. The Great Emu War
    Ironic Sans 2023-08-29T15:55:52+00:00
  2048. End of August Links
    Emily Short's Interactive Storytelling 2023-08-31T14:57:41+00:00
  2049. I have such a migraine
    Ironic Sans 2023-09-12T15:55:11+00:00
  2050. 48: Biff with Jacob O'Bryant
    The REPL 2023-09-16T02:08:06+00:00
  2051. The cost we bear
    Apperceptive by Sam 2023-09-22T15:31:29+00:00
  2052. Make Your Own Mon-Yu ⊟
    Tiny Cartridge 3DS 2023-09-22T21:32:56+00:00
  2053. Exploring Command-line space time
    Fabien Sanglard 2023-09-26T00:00:00+00:00
  2054. Where “Matrix Ping Pong” Came From
    Ironic Sans 2023-09-26T15:55:10+00:00
  2055. > 179: The age of divestment
    Laura Olin 2023-10-05T12:30:25+00:00
  2056. Tori (Ramen) - St. Paul, MN
    You Care What We Think 2023-10-06T14:24:00+00:00
  2057. Forty years of programming
    Fabien Sanglard 2023-10-08T00:00:00+00:00
  2058. I for one am looking forward to the release of that gender-flipped Rise and Fall of the Roman…
    Squashed 2023-10-08T00:26:44+00:00
  2059. I Can’t Believe The Navy Gave Me So Much Access
    Ironic Sans 2023-10-10T15:55:09+00:00
  2060. NeXt...
    Source Code in TV and Films 2023-10-11T16:02:29+00:00
  2061. NeXt (2020 series) I know I already shared something from this...
    Source Code in TV and Films 2023-10-11T16:02:46+00:00
  2062. Big Duck Energy
    Wild Information 2023-10-15T14:01:17+00:00
  2063. Does Go Have Subtyping?
    journal.stuffwithstuff.com 2023-10-19T07:00:00+00:00
  2064. Astro-Dodge's Dirty Video Tricks
    int10h.org - VileR's blog 2023-10-20T17:35:24+00:00
  2065. Why I’m Still Not Sick of ChatGPT
    Ironic Sans 2023-10-24T15:55:16+00:00
  2066. Sway review
    Winny's Blog 2023-10-26T05:00:00+00:00
  2067. To Boldly Go Down The Hall
    Wild Information 2023-10-26T22:14:01+00:00
  2068. I knew this was coming
    Apperceptive by Sam 2023-10-27T16:29:26+00:00
  2069. Why Do Peephole Optimizations Work?
    Embedded in Academia 2023-11-01T16:23:20+00:00
  2070. > 180: You want to see my hands?
    Laura Olin 2023-11-02T11:41:40+00:00
  2071. How the field of "AI" got like this
    Apperceptive by Sam 2023-11-02T14:40:06+00:00
  2072. 0x4 reasons to write and publish
    Fabien Sanglard 2023-11-07T00:00:00+00:00
  2073. A Celebrity in Every Taxi
    Ironic Sans 2023-11-07T16:55:17+00:00
  2074. The bash book to rule them all
    Fabien Sanglard 2023-11-08T00:00:00+00:00
  2075. Fear of Trees
    Wild Information 2023-11-12T16:00:31+00:00
  2076. Moving from Team17
    GAMEPOPPER 2023-11-13T16:10:12+00:00
  2077. My Free Speech Means You Have To Shut Up
    The Popehat Report 2023-11-20T03:11:56+00:00
  2078. Have You Heard About Montana?!
    Ironic Sans 2023-11-21T16:55:33+00:00
  2079. In Which I Repent On Free Speech Culture
    The Popehat Report 2023-11-22T01:42:44+00:00
  2080. Dialogue Expressiveness in Mask of the Rose
    Emily Short's Interactive Storytelling 2023-11-22T17:51:48+00:00
  2081. How Apple's Pro Display XDR takes Thunderbolt 3 to its limit
    Fabien Sanglard 2023-11-23T00:00:00+00:00
  2082. untitled
    EV NY: 30 yrs and now 2023-11-26T13:25:09+00:00
  2083. “Upon seeing Daniel Root’s photographs of Manhattan’s downtown bars, I was immediately taken by the…
    EV NY: 30 yrs and now 2023-11-26T13:28:41+00:00
  2084. “Daniel Root’s photos of New York bars at dawn are a perfect blend of beauty and melancholia….
    EV NY: 30 yrs and now 2023-11-26T13:29:33+00:00
  2085. “I’ve always said the very best bars are inviting, whether packed or empty. Daniel Root’s amazing…
    EV NY: 30 yrs and now 2023-11-26T13:30:48+00:00
  2086. Forty years in NYC, in those very neighborhoods, having seen countless NYC pictures and yet here was…
    EV NY: 30 yrs and now 2023-11-26T13:32:00+00:00
  2087. “New York bars at Dawn”
    EV NY: 30 yrs and now 2023-11-26T13:34:51+00:00
  2088. “New York Bars at Dawn”
    EV NY: 30 yrs and now 2023-11-26T13:36:36+00:00
  2089. “New York Bars at Dawn”
    EV NY: 30 yrs and now 2023-11-26T13:38:36+00:00
  2090. “New York Bars at Dawn”
    EV NY: 30 yrs and now 2023-11-26T13:39:52+00:00
  2091. “New York Bars at Dawn”
    EV NY: 30 yrs and now 2023-11-26T13:41:32+00:00
  2092. untitled
    EV NY: 30 yrs and now 2023-11-26T13:53:10+00:00
  2093. untitled
    EV NY: 30 yrs and now 2023-11-26T13:53:49+00:00
  2094. untitled
    EV NY: 30 yrs and now 2023-11-26T13:54:26+00:00
  2095. untitled
    EV NY: 30 yrs and now 2023-11-26T13:54:55+00:00
  2096. untitled
    EV NY: 30 yrs and now 2023-11-26T13:55:30+00:00
  2097. untitled
    EV NY: 30 yrs and now 2023-11-26T13:56:21+00:00
  2098. untitled
    EV NY: 30 yrs and now 2023-11-26T13:57:04+00:00
  2099. untitled
    EV NY: 30 yrs and now 2023-11-26T13:57:43+00:00
  2100. untitled
    EV NY: 30 yrs and now 2023-11-26T13:58:38+00:00
  2101. untitled
    EV NY: 30 yrs and now 2023-11-26T13:59:27+00:00
  2102. Punishment Envy And The Perils Of Institutional Engagement
    The Popehat Report 2023-11-28T22:32:42+00:00
  2103. I Fight For The Users
    Coding Horror 2023-11-30T20:11:05+00:00
  2104. Living in a Lucid Dream
    Wild Information 2023-11-30T23:14:52+00:00
  2105. Suit Viewing Opportunities
    Fujichia 2023-12-05T15:21:03+00:00
  2106. Gift Guide For Fictional Characters
    Ironic Sans 2023-12-05T16:55:21+00:00
  2107. Stop Demanding Dumb Answers To Hard Questions
    The Popehat Report 2023-12-07T17:36:36+00:00
  2108. The emptiness at the heart of emotion recognition
    Apperceptive by Sam 2023-12-08T13:41:24+00:00
  2109. Trunk Updates 11 December 2023 and Tournament Announcement
    Dungeon Crawl Stone Soup 2023-12-11T20:05:28+00:00
  2110. Cup of Coffee: December 18, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-18T11:10:18+00:00
  2111. Cup of Coffee: December 19, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-19T11:10:34+00:00
  2112. CERN
    ntoll.org 2023-12-19T13:30:00+00:00
  2113. How Are You? Just Give Me Your Stock Answer.
    Ironic Sans 2023-12-19T16:55:39+00:00
  2114. Cup of Coffee: December 20, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-20T11:10:49+00:00
  2115. My 2023 in review
    Winny's Blog 2023-12-21T06:00:00+00:00
  2116. Cup of Coffee: December 21, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-21T11:10:34+00:00
  2117. UK tv series called COBRA: Cyberwar starring Robert Carlyle - I...
    Source Code in TV and Films 2023-12-21T18:30:35+00:00
  2118. Didn’t know if this blog was still a thing, but here it is!...
    Source Code in TV and Films 2023-12-21T18:30:38+00:00
  2119. Substack's response to Substackers against Nazis sucks
    Cup of Coffee by Craig Calcaterra 2023-12-21T19:37:19+00:00
  2120. Substack Has A Nazi Opportunity
    The Popehat Report 2023-12-21T20:25:33+00:00
  2121. Cup of Coffee: December 22, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-22T11:10:31+00:00
  2122. Cup of Coffee: Merry Christmas!
    Cup of Coffee by Craig Calcaterra 2023-12-25T11:41:38+00:00
  2123. Cup of Coffee: December 27, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-27T11:10:26+00:00
  2124. Cup of Coffee: December 28, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-28T11:10:54+00:00
  2125. May A Public University Fire Its Chancellor For Appearing In Porn Videos On His Own Time?
    The Popehat Report 2023-12-28T23:38:26+00:00
  2126. Cup of Coffee: December 29, 2023
    Cup of Coffee by Craig Calcaterra 2023-12-29T11:10:19+00:00
  2127. 49: Clerk with Martin Kavalar
    The REPL 2023-12-29T21:23:50+00:00
  2128. How bad are search results? Let's compare Google, Bing, Marginalia, Kagi, Mwmbl, and ChatGPT
    Dan Luu 2023-12-30T00:00:00+00:00
  2129. Why Android developers no longer need Windows USB drivers
    Fabien Sanglard 2023-12-30T00:00:00+00:00
  2130. The Time of Big Walking
    Wild Information 2023-12-31T21:21:38+00:00
  2131. Full UI Upscaling, Part 1: History and Theory
    Grid Sage Games 2024-01-02T04:04:36+00:00
  2132. Cup of Coffee: January 2, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-02T11:10:30+00:00
  2133. Upgrading my Workstation to NixOS 23.11
    Winny's Blog 2024-01-03T06:00:00+00:00
  2134. Cup of Coffee: January 3, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-03T11:10:08+00:00
  2135. Cup of Coffee: January 4, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-04T11:10:47+00:00
  2136. Cup of Coffee: January 5, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-05T11:10:30+00:00
  2137. Full UI Upscaling, Part 2: Holy Mockups!
    Grid Sage Games 2024-01-05T13:50:40+00:00
  2138. How LLMs are and are not like the brain
    Apperceptive by Sam 2024-01-05T16:57:51+00:00
  2139. Heraclitus: The Unity of Opposites
    ntoll.org 2024-01-07T11:30:00+00:00
  2140. Cup of Coffee: January 8, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-08T11:10:21+00:00
  2141. Moving to Ghost 👻
    Ironic Sans 2024-01-08T19:41:06+00:00
  2142. Cup of Coffee: January 9, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-09T11:10:23+00:00
  2143. Multiple arguments in shebang
    Winny's Blog 2024-01-10T06:00:00+00:00
  2144. Cup of Coffee: January 10, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-10T11:11:00+00:00
  2145. Cup of Coffee: January 11, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-11T11:10:16+00:00
  2146. > 181: It has taken all our strength
    Laura Olin 2024-01-11T12:41:25+00:00
  2147. Full UI Upscaling, Part 3: Dynamic Terminal Swapping
    Grid Sage Games 2024-01-12T06:31:24+00:00
  2148. Cup of Coffee: January 12, 2024
    Cup of Coffee by Craig Calcaterra 2024-01-12T11:10:25+00:00
  2149. Win A Dream Date With A Litigious Douchebag!
    The Popehat Report 2024-01-12T17:27:24+00:00
  2150. Cup of Coffee has moved to Beehiiv
    Cup of Coffee by Craig Calcaterra 2024-01-14T22:02:26+00:00
  2151. Another NixOS 23.11 upgrade gotcha
    Winny's Blog 2024-01-15T06:00:00+00:00
  2152. 0.31 Tournament Page
    Dungeon Crawl Stone Soup 2024-01-17T03:32:35+00:00
  2153. Full UI Upscaling, Part 4: Simpler Lightweight Fonts
    Grid Sage Games 2024-01-19T05:34:15+00:00
  2154. 0.31 “The Alchemy of Forms”
    Dungeon Crawl Stone Soup 2024-01-19T06:36:33+00:00
  2155. Destructive investing and the siren song of software
    Apperceptive by Sam 2024-01-19T21:07:06+00:00
  2156. How the DevTeam conquered the iPhone
    Fabien Sanglard 2024-01-21T00:00:00+00:00
  2157. Games at Mumbai, Day 0 Talks
    Combinatorial Game Theory 2024-01-22T04:02:00+00:00
  2158. Games at Mumbai, Day 1 Talks
    Combinatorial Game Theory 2024-01-22T14:15:00+00:00
  2159. Games at Mumbai, Day 2 Talks
    Combinatorial Game Theory 2024-01-23T12:51:00+00:00
  2160. Games at Mumbai, Day 3 Talks
    Combinatorial Game Theory 2024-01-24T13:20:00+00:00
  2161. Why do people post on [bad platform] instead of [good platform]?
    Dan Luu 2024-01-25T00:00:00+00:00
  2162. Games at Mumbai, Day 4 (Final) of Talks
    Combinatorial Game Theory 2024-01-25T11:41:00+00:00
  2163. Games at Mumbai: Not the Talks
    Combinatorial Game Theory 2024-01-25T12:44:00+00:00
  2164. How to Learn Nix, Part 48: Installing (single-user) Nix on macOS
    Ian Henry 2024-01-26T00:00:00+00:00
  2165. Test your backups
    Winny's Blog 2024-01-27T06:00:00+00:00
  2166. How to Learn Nix, Part 49: nix-direnv is a huge quality of life improvement
    Ian Henry 2024-01-28T00:00:00+00:00
  2167. Notes on Cruise's pedestrian accident
    Dan Luu 2024-01-29T00:00:00+00:00
  2168. The Popehat Report Is Moving To Beehiiv
    The Popehat Report 2024-01-30T22:32:46+00:00
  2169. Would the Buddha Wear a Walkman?
    Wild Information 2024-02-02T18:38:01+00:00
  2170. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2024-02-04T15:44:51+00:00
  2171. I haven’t posted anything on here in years, but I thought it’d be funny to just drop this lil Claude…
    Zac Gorman 2024-02-05T17:20:30+00:00
  2172. Why it's impossible to agree on what's allowed
    Dan Luu 2024-02-07T00:00:00+00:00
  2173. Why those "training data poisoning" gimmicks don't really work
    Apperceptive by Sam 2024-02-09T13:42:58+00:00
  2174. Adventures in Map Zooming, Part 5: QoL
    Grid Sage Games 2024-02-11T08:30:03+00:00
  2175. Game Font Forensics
    int10h.org - VileR's blog 2024-02-11T20:26:57+00:00
  2176. 0.31 Tournament Results
    Dungeon Crawl Stone Soup 2024-02-12T05:02:19+00:00
  2177. Diseconomies of scale in fraud, spam, support, and moderation
    Dan Luu 2024-02-18T00:00:00+00:00
  2178. Brighter Than a Cloud
    Wild Information 2024-02-18T17:21:57+00:00
  2179. Full UI Upscaling, Part 5: Completion and Demos
    Grid Sage Games 2024-02-23T03:25:55+00:00
  2180. 50: Peter Taoussanis
    The REPL 2024-02-27T08:00:00+00:00
  2181. 51: Building a text editor with Nate Hunzaker
    The REPL 2024-03-05T01:00:00+00:00
  2182. Tiger Unlimited
    Fujichia 2024-03-09T22:15:24+00:00
  2183. How web bloat impacts users with slow devices
    Dan Luu 2024-03-16T00:00:00+00:00
  2184. Steal These Surface Duo Ideas
    ignorethecode.net 2024-03-16T10:32:53+00:00
  2185. When Animals Dream
    Wild Information 2024-03-17T15:30:46+00:00
  2186. Cozy Space Survivors
    Simonschreibt. 2024-03-19T18:34:14+00:00
  2187. I.F.O. (Identified Flying Object) (1987) atari basic source...
    Source Code in TV and Films 2024-03-24T17:14:01+00:00
  2188. JIM & KERRY
    Infinite Gossip 2024-03-25T05:15:40+00:00
  2189. > 182: Do you trust me? Do I trust you?
    Laura Olin 2024-03-28T08:00:00+00:00
  2190. The hearts of the Super Nintendo
    Fabien Sanglard 2024-04-01T00:00:00+00:00
  2191. The Victorian Python Community (an Allegory)
    ntoll.org 2024-04-01T05:30:00+00:00
  2192. thunder cracks, mysterious rattling sounds ⊟
    Tiny Cartridge 3DS 2024-04-04T14:30:03+00:00
  2193. CDI is Now Official and CKO is Going Offline
    Dungeon Crawl Stone Soup 2024-04-07T01:05:34+00:00
  2194. The evolution of the Super Nintendo motherboard
    Fabien Sanglard 2024-04-08T00:00:00+00:00
  2195. I DISCOVERED A NEW FRUIT
    Infinite Gossip 2024-04-09T06:14:44+00:00
  2196. Daylight Saving Time
    ignorethecode.net 2024-04-14T10:25:05+00:00
  2197. 52: Coding in YAML with Ingy döt Net
    The REPL 2024-04-14T19:00:00+00:00
  2198. Update Preview: Blood in the Water
    Barotrauma 2024-04-19T15:19:06+00:00
  2199. Dataflow Analyses and Compiler Optimizations that Use Them, for Free
    Embedded in Academia 2024-04-20T21:55:33+00:00
  2200. Inside the Super Nintendo cartridges
    Fabien Sanglard 2024-04-21T00:00:00+00:00
  2201. 53: Clojure LSP with Eric Dallo
    The REPL 2024-04-21T20:33:31+00:00
  2202. Sprouts 2024 Talks
    Combinatorial Game Theory 2024-04-23T20:29:00+00:00
  2203. The Pacification of War
    Lefineder’s Substack 2024-04-23T22:34:28+00:00
  2204. Live Lone and Prosper
    Lefineder’s Substack 2024-04-26T18:30:59+00:00
  2205. A Forest from the Moon
    Wild Information 2024-04-28T17:09:34+00:00
  2206. > 183: He stole forsythia.
    Laura Olin 2024-05-02T12:00:00+00:00
  2207. paste.winny.tech (Sillypaste) is dead
    Winny's Blog 2024-05-04T05:00:00+00:00
  2208. Pair Your Compilers At The ABI Café - Faultlore
    Faultlore 2024-05-05T00:00:00+00:00
  2209. LET ME TELL YOU ABOUT MY SAAB VIII
    Infinite Gossip 2024-05-07T03:07:31+00:00
  2210. Hyperlink Island
    Wild Information 2024-05-12T14:00:59+00:00
  2211. What Waymo's NHTSA investigation says about how far along autonomous cars are
    Apperceptive by Sam 2024-05-15T15:45:20+00:00
  2212. Neovim 0.10
    Neovim 2024-05-16T00:00:00+00:00
  2213. New Guy Alert
    Fujichia 2024-05-16T17:20:07+00:00
  2214. THE LOCAL GHOSTS
    Infinite Gossip 2024-05-20T22:45:07+00:00
  2215. The Lunacy of Artemis
    Idle Words 2024-05-24T10:12:00+00:00
  2216. What the FTC got wrong in the Google antitrust investigation
    Dan Luu 2024-05-26T00:00:00+00:00
  2217. HOW DO WE KILL CHILDREN
    Infinite Gossip 2024-05-28T07:14:50+00:00
  2218. > 184: We love what we have, no matter how little
    Laura Olin 2024-05-30T12:00:00+00:00
  2219. Supervision and truth
    Apperceptive by Sam 2024-05-31T16:23:09+00:00
  2220. On Paying Attention
    ntoll.org 2024-06-01T12:00:00+00:00
  2221. Pleasant Realms
    Fujichia 2024-06-06T14:22:46+00:00
  2222. Preview: Summer Update 2024
    Barotrauma 2024-06-07T15:36:34+00:00
  2223. The Inner Space Race
    Wild Information 2024-06-09T15:12:24+00:00
  2224. Other Worlds Zine Fair - Marrickville 23/6
    Infinite Gossip 2024-06-14T03:56:05+00:00
  2225. A discussion of discussions on AI bias
    Dan Luu 2024-06-16T00:00:00+00:00
  2226. > 185: Run them through butter
    Laura Olin 2024-06-27T12:00:00+00:00
  2227. Revisiting Number Theory and the Impossible Puzzle
    a blog by biggiemac42 2024-06-29T06:52:02+00:00
  2228. The Queen's Doll's House
    Wild Information 2024-07-02T19:23:42+00:00
  2229. TOI-700
    Infinite Gossip 2024-07-05T04:20:04+00:00
  2230. Institute for Controlled Speleogenesis
    BLDGBLOG 2024-07-08T02:18:04+00:00
  2231. Fireside Chat: Founders Inc
    Vjeux 2024-07-14T02:42:08+00:00
  2232. Podcast: Software Engineering Daily
    Vjeux 2024-07-14T02:45:11+00:00
  2233. Podcast: devtoolsFM
    Vjeux 2024-07-14T02:46:25+00:00
  2234. Podcast: Coder pour changer une vie
    Vjeux 2024-07-14T02:47:53+00:00
  2235. Podcast: Changelog
    Vjeux 2024-07-14T02:49:30+00:00
  2236. Panel on Layout Performance – EdgeConf 4
    Vjeux 2024-07-14T02:52:08+00:00
  2237. React Documentary
    Vjeux 2024-07-14T02:54:19+00:00
  2238. CPUID instruction and table
    Winny's Blog 2024-07-15T05:00:00+00:00
  2239. New Crawl Servers and a Possible Server Retirement
    Dungeon Crawl Stone Soup 2024-07-20T00:57:05+00:00
  2240. Dollar Country Newsletter, July 2024
    Dollar Country Newsletter & Radio Show 2024-07-22T00:30:54+00:00
  2241. Update the NAS to 24.05
    Winny's Blog 2024-07-23T05:00:00+00:00
  2242. > 186: Synonyms haunted. Synonyms meaningful.
    Laura Olin 2024-07-25T13:17:45+00:00
  2243. Episode 248 & 249: Ol' Bertha Needs A Big Al / I'm Out On The Town
    Dollar Country Newsletter & Radio Show 2024-07-26T20:46:45+00:00
  2244. Carving the Super Nintendo Video System
    Fabien Sanglard 2024-07-29T00:00:00+00:00
  2245. ONE EYE
    Infinite Gossip 2024-07-30T06:58:41+00:00
  2246. Tintype of a handsome dandy with fabulous hair, c. 1860s
    dead gorgeous 2024-07-31T00:39:49+00:00
  2247. The Colony Makes The World
    Wild Information 2024-08-04T20:01:11+00:00
  2248. Daguerreotype of a tough guy missing an eye, half a pinky—and, perhaps, the subjects of the group…
    dead gorgeous 2024-08-04T23:00:57+00:00
  2249. Daguerreotype of a stylish young swell possessed of the fine features and lofty brow that bring all…
    dead gorgeous 2024-08-05T02:23:36+00:00
  2250. Carte de visite of strapping Swedish naval officer Jarl Christiersson, c. 1860
    dead gorgeous 2024-08-05T16:04:38+00:00
  2251. Name the Non-Standard PC Code Page
    int10h.org - VileR's blog 2024-08-07T08:43:47+00:00
  2252. The Curious Case of Col's Computational Complexity
    Combinatorial Game Theory 2024-08-07T17:25:00+00:00
  2253. Gijs Gieskes "Zonnepanneel 2"
    Pleasant Realms 2024-08-07T21:32:44+00:00
  2254. SNES: Sprites and backgrounds rendering
    Fabien Sanglard 2024-08-09T00:00:00+00:00
  2255. How the SNES Graphics System works
    Fabien Sanglard 2024-08-09T00:00:00+00:00
  2256. How good can you be at Codenames without knowing any words?
    Dan Luu 2024-08-11T00:00:00+00:00
  2257. Quote-unquote "macros"
    Ian Henry 2024-08-12T00:00:00+00:00
  2258. Kushkuli Box Competition
    Pleasant Realms 2024-08-13T13:31:20+00:00
  2259. MY CONTACT AT THE RAT FACTORY
    Infinite Gossip 2024-08-15T05:06:54+00:00
  2260. Watching sunsets
    Fabien Sanglard 2024-08-18T00:00:00+00:00
  2261. Bake Notes 2024.08.17
    Maybe Pizza? 2024-08-18T18:38:26+00:00
  2262. 0.32 Release and Tournament
    Dungeon Crawl Stone Soup 2024-08-19T03:34:02+00:00
  2263. Haunted Mansion Lights On
    Pleasant Realms 2024-08-20T13:06:02+00:00
  2264. Magic, Modified
    Demon 2024-08-21T17:08:02+00:00
  2265. 2024 Minnesota State Fair - Falcon Heights, MN
    You Care What We Think 2024-08-23T00:38:00+00:00
  2266. Late summer greetings
    Barotrauma 2024-08-23T14:21:09+00:00
  2267. Dollar Country Newsletter, August 2024
    Dollar Country Newsletter & Radio Show 2024-08-25T12:00:28+00:00
  2268. Carte de visite of three British soldiers on beer break, c. 1860s
    dead gorgeous 2024-08-27T16:03:51+00:00
  2269. Absolutely stunning post-mortem daguerreotype of a young man with killer cheekbones and haunting…
    dead gorgeous 2024-08-27T22:01:53+00:00
  2270. 2024 Minnesota State Fair (Take 2) – Falcon Heights, MN
    You Care What We Think 2024-08-28T14:13:00+00:00
  2271. Ambrotype of a jovial gent who won’t allow an injured arm to cramp his style, c. 1860s
    dead gorgeous 2024-08-28T16:02:47+00:00
  2272. CHLOE, 21 FROM STOCKPORT
    Infinite Gossip 2024-08-28T23:41:30+00:00
  2273. Beethoven “Moonlight Sonata” for Old Elephant
    Pleasant Realms 2024-08-29T18:33:11+00:00
  2274. 0.32 “Gods and Makers”
    Dungeon Crawl Stone Soup 2024-08-29T22:20:16+00:00
  2275. Ambrotype of two boxers about to engage, c. 1850s
    dead gorgeous 2024-08-30T06:28:41+00:00
  2276. Ambrotype of a pipe-puffing pair of comrades in arms, c. 1861-65
    dead gorgeous 2024-08-30T21:42:21+00:00
  2277. Postcard of a very refined young man reading a letter with the precise degree of drama appropriate…
    dead gorgeous 2024-08-31T13:24:21+00:00
  2278. Cabinet card of “BEAUTY,” the Male Chick-Rearing Cat, 1889. As the back of the card enthuses:
    dead gorgeous 2024-08-31T18:02:07+00:00
  2279. I hope you use ShellCheck
    Winny's Blog 2024-09-01T05:00:00+00:00
  2280. Northbound Smokehouse & Brewpub - Minneapolis, MN
    You Care What We Think 2024-09-01T16:30:00+00:00
  2281. Stereoview of a hussar and his sweetheart, or perhaps ex-sweetheart, c. 1850s
    dead gorgeous 2024-09-01T23:21:26+00:00
  2282. Multitile Actors, Revisited
    Grid Sage Games 2024-09-03T08:00:05+00:00
  2283. Stereoscopic daguerreotype of two men playing chess in front of a mirror, c. 1840s
    dead gorgeous 2024-09-03T17:34:03+00:00
  2284. Daguerreotype of a gentleman with hard, hawkish eyes and a prim little kitten bow, c. 1840s
    dead gorgeous 2024-09-04T17:37:57+00:00
  2285. Looking for Missed Alarm Bugs in a Formal Verification Tool
    Embedded in Academia 2024-09-04T18:29:03+00:00
  2286. > 187: Colours dull with injustice etc.,
    Laura Olin 2024-09-05T13:47:52+00:00
  2287. How To Turn A Sphere Inside Out
    Pleasant Realms 2024-09-05T15:43:11+00:00
  2288. Daguerreotype of a gentleman with an artfully tied blue silk cravat, c. 1840s
    dead gorgeous 2024-09-05T19:10:40+00:00
  2289. How Long Does a Grain Revolution Take?
    Maybe Pizza? 2024-09-06T21:47:14+00:00
  2290. Clyde's Drive-In - Manistique, MI
    You Care What We Think 2024-09-06T22:02:00+00:00
  2291. Shellcheck and Emacs
    Winny's Blog 2024-09-08T05:00:00+00:00
  2292. Detail from a cabinet card of two officers with their arms entwined, their swords well-hung, and…
    dead gorgeous 2024-09-08T18:29:32+00:00
  2293. Pizza Roundup 2024.09
    Maybe Pizza? 2024-09-09T16:32:19+00:00
  2294. Carte de visite of a serenely self-assured young naval officer identified on reverse as Octave…
    dead gorgeous 2024-09-10T01:10:55+00:00
  2295. THE BALLAD OF THE HOLLYWOOD CASTING DIRECTOR WHO SELECTS THE FIRST MANNED MISSION TO MARS
    Infinite Gossip 2024-09-10T23:10:31+00:00
  2296. Sawmill Pizza and Brew Shed - Clear Lake, WI
    You Care What We Think 2024-09-12T16:00:00+00:00
  2297. Everything Is Everything, by Koki Tanaka
    Pleasant Realms 2024-09-12T16:29:26+00:00
  2298. I Broke It
    nklein software 2024-09-13T15:46:01+00:00
  2299. Carte de visite of a richly attired Hungarian aristocrat resplendent in fur-trimmed cape and hessian…
    dead gorgeous 2024-09-13T23:51:47+00:00
  2300. From a recovering former Python community member
    ntoll.org 2024-09-16T17:00:00+00:00
  2301. 54: JRuby with Charles Oliver Nutter
    The REPL 2024-09-17T01:00:00+00:00
  2302. 🦀 Four Thousand Weeks in Rust
    Nathan Youngman 2024-09-18T00:00:00+00:00
  2303. Round Man Brewing Company - Spooner, WI
    You Care What We Think 2024-09-18T16:00:00+00:00
  2304. Sneak peek: Alien ruin and husk improvements
    Barotrauma 2024-09-20T15:14:52+00:00
  2305. Tom Vincent's Vincenzo's Pizzeria
    Maybe Pizza? 2024-09-20T21:59:22+00:00
  2306. Reaching Long-Awaited Perfection on Opus Magnum’s Final Level
    a blog by biggiemac42 2024-09-22T06:54:11+00:00
  2307. Being a Tech Art Detective
    Simonschreibt. 2024-09-23T13:09:47+00:00
  2308. Laka Lono Rum Club - Omaha, NE
    You Care What We Think 2024-09-24T17:00:00+00:00
  2309. 0.32 Tournament Results
    Dungeon Crawl Stone Soup 2024-09-25T02:32:37+00:00
  2310. I BUILT A TIME MACHINE
    Infinite Gossip 2024-09-26T04:49:40+00:00
  2311. Separating Litharge at Top Speed
    a blog by biggiemac42 2024-09-26T08:35:08+00:00
  2312. Steve Reich's "Clapping Music" on the el green line
    Pleasant Realms 2024-09-26T21:25:28+00:00
  2313. Ray Tracing In One Weekend (in Lisp, and n-dimensions)
    nklein software 2024-09-27T02:37:31+00:00
  2314. Bake Notes 2024.09.28: Oven Experiments With Bread
    Maybe Pizza? 2024-09-29T00:24:44+00:00
  2315. Bake Notes 2024.09.29: Oven Experiments With Bread Part 2
    Maybe Pizza? 2024-09-29T17:18:09+00:00
  2316. What's a Brain?
    Wild Information 2024-09-29T21:08:05+00:00
  2317. The Rise of Kamikaze: Why Japan Turned to Suicide Attacks in WWII
    Steelsnowflake 2024-09-30T09:04:46+00:00
  2318. Bake Notes 2024.09.30: Oven Experiments With Bread Part 3
    Maybe Pizza? 2024-09-30T22:00:15+00:00
  2319. Ponder This Challenge - October 2024 - Splitting a number
    IBM Ponder This 2024-10-01T00:00:00+00:00
  2320. Carte de visite of two fine fellows rocking like it’s 1892 (per date on reverse), with their little…
    dead gorgeous 2024-10-01T03:36:13+00:00
  2321. Daguerreotype of a pair of bow-tied beaus, c. 1840s
    dead gorgeous 2024-10-02T03:12:11+00:00
  2322. Bake Notes 2024.10.02: Oven Experiments With Bread Part 4
    Maybe Pizza? 2024-10-03T04:11:33+00:00
  2323. > 188: safe through the generous fields
    Laura Olin 2024-10-03T12:52:52+00:00
  2324. Sneak peek: PvP Overhaul
    Barotrauma 2024-10-04T13:46:22+00:00
  2325. Bake Notes 2024.10.05: Oven Experiments With Bread Part 5 and 6
    Maybe Pizza? 2024-10-05T18:27:45+00:00
  2326. Missing IBM PC Localization Disks & ROMs
    int10h.org - VileR's blog 2024-10-06T20:59:38+00:00
  2327. Bake Notes 2024.10.07: Oven Experiments With Bread Part 7 - Success
    Maybe Pizza? 2024-10-08T04:20:36+00:00
  2328. Bake Notes 2024.10.08: Oven Experiments With Bread Part 8
    Maybe Pizza? 2024-10-08T21:00:44+00:00
  2329. Celebrating New Achievements in NES Tetris
    a blog by biggiemac42 2024-10-09T07:01:44+00:00
  2330. Isopod Terrarium
    Pleasant Realms 2024-10-11T16:19:45+00:00
  2331. Bake Notes 2024.10.10: Oven Experiments With Bread Part 9
    Maybe Pizza? 2024-10-11T20:09:38+00:00
  2332. Dollar Country Newsletter, October 2024
    Dollar Country Newsletter & Radio Show 2024-10-13T12:03:19+00:00
  2333. The Hidden Bird Algorithm
    Wild Information 2024-10-13T15:01:56+00:00
  2334. Pizza / Bread Roundup 002
    Maybe Pizza? 2024-10-13T19:51:22+00:00
  2335. Bake Notes 2024.10.14 Oven Experiments With Bread Part 10
    Maybe Pizza? 2024-10-14T22:22:56+00:00
  2336. Wonderful Toolchain project update - October 2024
    Posts on asie's blog 2024-10-17T00:00:00+00:00
  2337. BAR
    Infinite Gossip 2024-10-18T02:26:55+00:00
  2338. 55: Instant: a modern Firebase in Clojure, with Stepan Parunashvili
    The REPL 2024-10-18T06:00:32+00:00
  2339. Coming next week: Unto the Breach update
    Barotrauma 2024-10-18T15:57:40+00:00
  2340. Sade Parking Lot
    Pleasant Realms 2024-10-18T16:43:29+00:00
  2341. Cattle Grazing Is Not the Answer to Climate Change
    Steelsnowflake 2024-10-19T11:01:40+00:00
  2342. The Empyrean’s New Clothes
    Demon 2024-10-19T19:12:01+00:00
  2343. Bake Notes 2024.10.21: Oven Experiments With Bread Part 13
    Maybe Pizza? 2024-10-21T20:37:35+00:00
  2344. Rainbow Gray
    Steelsnowflake 2024-10-24T15:01:54+00:00
  2345. Steve Ballmer was an underrated CEO
    Dan Luu 2024-10-28T00:00:00+00:00
  2346. Chilean Sea Bass
    The Curiosity Cabinet 2024-10-28T02:37:03+00:00
  2347. Bach To The Future - Toccata and Fugue in D minor
    Pleasant Realms 2024-10-29T16:23:21+00:00
  2348. THE WAR CRIMINALS AFTER THE WAR
    Infinite Gossip 2024-10-30T21:43:38+00:00
  2349. Ponder This Challenge - November 2024 - Tetrahedron volumes
    IBM Ponder This 2024-11-01T00:00:00+00:00
  2350. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2024-11-03T22:54:21+00:00
  2351. The Secret Ballot
    The Curiosity Cabinet 2024-11-05T17:08:25+00:00
  2352. The Paradox of Progress: Mencken on Democracy
    Steelsnowflake 2024-11-06T19:29:39+00:00
  2353. > 189: AIN' EVEN BEEN PLANTED YET
    Laura Olin 2024-11-07T14:04:47+00:00
  2354. untitled
    Terrible Banana 2024-11-11T01:31:08+00:00
  2355. A software controlled power supply for $25
    The Grymoire 2024-11-11T18:31:50+00:00
  2356. Dollar Country Newsletter, November 2024
    Dollar Country Newsletter & Radio Show 2024-11-11T23:56:51+00:00
  2357. Midwestern Luxury = Black Walnut Cake
    Midwesterner 2024-11-13T13:02:59+00:00
  2358. THE SWIMMERS
    Infinite Gossip 2024-11-13T21:38:36+00:00
  2359. Submarine Highlights (live right now!!!!)
    Pleasant Realms 2024-11-14T16:24:21+00:00
  2360. Regarding the future of BlocksDS
    Posts on asie's blog 2024-11-15T17:30:00+00:00
  2361. Rivers of Blood
    Demon 2024-11-16T05:21:24+00:00
  2362. Episodes 250 & 251: Ten Mile Zone / It's Not Much But It's Home
    Dollar Country Newsletter & Radio Show 2024-11-17T14:02:16+00:00
  2363. Sitters and Standers
    The Pudding 2024-11-19T06:00:00+00:00
  2364. Hanky Pankies for the Holidays
    Midwesterner 2024-11-20T13:00:49+00:00
  2365. Pizza Roundup 003
    Maybe Pizza? 2024-11-20T17:40:12+00:00
  2366. crowfunding: 868-BACK
    Mighty Vision 2024-11-21T14:49:00+00:00
  2367. crowfunding: 868-BACK
    Mighty Vision 2024-11-21T14:49:00+00:00
  2368. untitled
    Terrible Banana 2024-11-25T04:00:31+00:00
  2369. Perfect imperfection
    medievalbooks 2024-11-25T11:15:34+00:00
  2370. Beastly beginnings
    medievalbooks 2024-11-26T10:45:30+00:00
  2371. CROW FUN
    Mighty Vision 2024-11-28T12:54:00+00:00
  2372. CROW FUN
    Mighty Vision 2024-11-28T12:54:00+00:00
  2373. A COLONY OF MAGGOTS BUILDING A FOX THROUGH CAREFUL EFFORT
    Infinite Gossip 2024-11-28T21:25:28+00:00
  2374. Green Bean Casserole
    The Curiosity Cabinet 2024-11-29T01:20:43+00:00
  2375. Ray Tracing Extra-dimensional CSG Objects
    nklein software 2024-11-30T15:43:17+00:00
  2376. Ponder This Challenge - December 2024 - Counting numbers with specific digits
    IBM Ponder This 2024-12-01T00:00:00+00:00
  2377. Train Driver Record Hanoi to Ninh Bahn
    Pleasant Realms 2024-12-02T14:11:03+00:00
  2378. The Great Filter Comes For Us All
    Coding Horror 2024-12-02T18:25:46+00:00
  2379. The 2024 Midwesterner Gift Guide
    Midwesterner 2024-12-05T13:02:59+00:00
  2380. Holiday greetings and update preview
    Barotrauma 2024-12-05T14:34:30+00:00
  2381. > 190: What are you trying to be free of?
    Laura Olin 2024-12-05T16:49:45+00:00
  2382. archive - patreon "about"
    Mighty Vision 2024-12-08T21:37:00+00:00
  2383. archive - patreon "about"
    Mighty Vision 2024-12-08T21:37:00+00:00
  2384. Advent of Code 2024
    Winny's Blog 2024-12-09T06:00:00+00:00
  2385. If the PO-33 K.O. was an OP-1
    Spongefile 2024-12-17T11:39:23+00:00
  2386. Gym motivator sheet
    Spongefile 2024-12-19T08:47:46+00:00
  2387. Year 11 of the Cogmind
    Grid Sage Games 2024-12-20T02:15:27+00:00
  2388. LET ME TELL YOU ABOUT MY SAAB IX
    Infinite Gossip 2024-12-23T23:01:45+00:00
  2389. Christmas Creep
    The Curiosity Cabinet 2024-12-26T11:40:40+00:00
  2390. Come and See
    Steelsnowflake 2024-12-26T13:41:23+00:00
  2391. Panettone, Taste Of Italy
    Pleasant Realms 2024-12-27T14:33:26+00:00
  2392. Jimmy Carter's UFO
    The Curiosity Cabinet 2024-12-30T04:54:41+00:00
  2393. Ponder This Challenge - January 2025 - The irrational three-jug problem
    IBM Ponder This 2025-01-01T00:00:00+00:00
  2394. > 191: Under the new weight of the sun
    Laura Olin 2025-01-02T16:18:09+00:00
  2395. Lost Obelisks
    Demon 2025-01-05T17:10:53+00:00
  2396. Browser Bits
    Winny's Blog 2025-01-07T06:00:00+00:00
  2397. Stay Gold, America
    Coding Horror 2025-01-07T07:42:04+00:00
  2398. Black Coffee And Shimza Type Of Effects
    Pleasant Realms 2025-01-09T16:26:29+00:00
  2399. Building Bauble
    Ian Henry 2025-01-10T00:00:00+00:00
  2400. DAVID AND HIS BROTHERS
    Infinite Gossip 2025-01-10T01:04:38+00:00
  2401. from Catching the Big Fish by David Lynch
    .mattfraction 2025-01-16T21:10:32+00:00
  2402. January Blues (a personal update)
    Steelsnowflake 2025-01-17T12:12:55+00:00
  2403. Finding 94123 Solutions to a Math Problem
    a blog by biggiemac42 2025-01-19T21:53:00+00:00
  2404. making choices on server map - part 1
    Mighty Vision 2025-01-23T22:52:00+00:00
  2405. making choices on server map - part 1
    Mighty Vision 2025-01-23T22:52:00+00:00
  2406. My friend Michael
    ntoll.org 2025-01-25T17:45:00+00:00
  2407. Dogecoin: A Series – Part 1
    The Curiosity Cabinet 2025-01-26T15:01:54+00:00
  2408. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2025-01-27T21:01:43+00:00
  2409. 21 best circular saw hacks
    Pleasant Realms 2025-01-28T15:45:56+00:00
  2410. When the Sackler Brothers studied LSD
    Res Obscura 2025-01-29T14:46:53+00:00
  2411. State Of The (Dollar) Country 2025
    Dollar Country Newsletter & Radio Show 2025-01-29T20:32:21+00:00
  2412. making choices on server map - part 2
    Mighty Vision 2025-01-30T23:00:00+00:00
  2413. making choices on server map - part 2
    Mighty Vision 2025-01-30T23:00:00+00:00
  2414. Dollar Country Newsletter, January 2025
    Dollar Country Newsletter & Radio Show 2025-01-31T14:03:33+00:00
  2415. Welcome to Europa, 2025
    Barotrauma 2025-01-31T15:22:38+00:00
  2416. CGTC 5, Day 0 Talk
    Combinatorial Game Theory 2025-01-31T22:13:00+00:00
  2417. CGTC 5, Day One Talks
    Combinatorial Game Theory 2025-01-31T22:55:00+00:00
  2418. Ponder This Challenge - February 2025 - Prime number magic square
    IBM Ponder This 2025-02-01T00:00:00+00:00
  2419. Pocket Operator sync modes explained
    Spongefile 2025-02-01T00:59:11+00:00
  2420. Dogecoin: The Founders – Part 2
    The Curiosity Cabinet 2025-02-02T14:25:58+00:00
  2421. CGTC 5, Day Two Talks
    Combinatorial Game Theory 2025-02-02T17:36:00+00:00
  2422. CGTC 5, Day Three Talks
    Combinatorial Game Theory 2025-02-03T00:02:00+00:00
  2423. The Alien Tome
    EXO 2025-02-05T01:07:42+00:00
  2424. The familiar loneliness of the Kinetoscope
    Res Obscura 2025-02-05T14:08:51+00:00
  2425. > 192: I will constitute the field
    Laura Olin 2025-02-06T16:27:48+00:00
  2426. Pizza Roundup 004
    Maybe Pizza? 2025-02-06T20:25:40+00:00
  2427. Cyberpunk: Broken Edges
    Simonschreibt. 2025-02-10T10:39:11+00:00
  2428. Dogecoin: The Tippers – Part 3
    The Curiosity Cabinet 2025-02-10T13:36:10+00:00
  2429. A THIRD DISTANCE
    Infinite Gossip 2025-02-12T00:29:34+00:00
  2430. Anno 1800: Shadows of Beauty
    Simonschreibt. 2025-02-12T21:33:09+00:00
  2431. Happy Lupercalia
    Res Obscura 2025-02-13T19:12:44+00:00
  2432. Introducing MechA, the Opus Magnum Metric of Your Nightmares
    a blog by biggiemac42 2025-02-16T16:53:46+00:00
  2433. My 2024 in Review
    Winny's Blog 2025-02-18T06:00:00+00:00
  2434. The Shape of a Mars Mission
    Idle Words 2025-02-19T23:36:00+00:00
  2435. Apocalypse Without End: D.H. Lawrence on Revelation
    Steelsnowflake 2025-02-20T12:17:13+00:00
  2436. Infinity Nikki: One-way Window
    Simonschreibt. 2025-02-22T22:52:09+00:00
  2437. Dogecoin: The Pump and Dumpers – Part 4
    The Curiosity Cabinet 2025-02-23T14:34:30+00:00
  2438. INFINITE GOSSIP IS NOW ON GHOST
    Infinite Gossip 2025-02-26T23:39:51+00:00
  2439. Ask a Midwesterner: Is Omaha Really America's Steak Capital?
    Midwesterner 2025-02-27T12:01:00+00:00
  2440. Google Summer of Code 2025
    Neovim 2025-02-28T00:00:00+00:00
  2441. Ponder This Challenge - March 2025 - Electric networks in graphs
    IBM Ponder This 2025-03-01T00:00:00+00:00
  2442. Dystopia of Decadence: Huxley’s Brave New World and Ours as Well
    Steelsnowflake 2025-03-01T12:11:39+00:00
  2443. Release the Hounds!
    Demon 2025-03-01T22:40:24+00:00
  2444. Infinity Nikki: Mysterious Shadow Drop
    Simonschreibt. 2025-03-02T19:47:47+00:00
  2445. The Middle Ages
    The Pudding 2025-03-03T06:00:00+00:00
  2446. Why fastDoom is fast
    Fabien Sanglard 2025-03-04T00:00:00+00:00
  2447. Surrendering Optimally
    a blog by biggiemac42 2025-03-04T18:25:04+00:00
  2448. AI legibility, physical archives, and the future of research
    Res Obscura 2025-03-05T18:21:34+00:00
  2449. Let's Talk About The American Dream
    Coding Horror 2025-03-06T01:27:31+00:00
  2450. Conquering the Final Cycle
    a blog by biggiemac42 2025-03-08T05:48:26+00:00
  2451. Vienna
    ntoll.org 2025-03-09T18:30:00+00:00
  2452. Onomatopoeia Odyssey
    The Pudding 2025-03-10T05:00:00+00:00
  2453. Interviewing #1 backyard composter Ahram Park
    The Rot 2025-03-10T17:55:52+00:00
  2454. The Last Barbecue Joint in Cairo
    Midwesterner 2025-03-12T11:03:24+00:00
  2455. > 193: I know now is not the time to take up flying.
    Laura Olin 2025-03-13T15:22:06+00:00
  2456. Spring update now available in the Unstable beta
    Barotrauma 2025-03-14T16:54:01+00:00
  2457. The High Heel Problem
    Simonschreibt. 2025-03-17T19:16:55+00:00
  2458. What we can learn from watching reality stars apologize
    The Pudding 2025-03-19T05:00:00+00:00
  2459. The Road Not Taken is Guaranteed Minimum Income
    Coding Horror 2025-03-20T23:33:13+00:00
  2460. Dollar Country Newsletter, March 2025
    Dollar Country Newsletter & Radio Show 2025-03-21T14:01:51+00:00
  2461. Neovim 0.11
    Neovim 2025-03-22T00:00:00+00:00
  2462. All notable upcoming Japanese RPGs (JRPGs) in 2025
    PS5 – Destructoid 2025-03-23T15:00:25+00:00
  2463. How to use Chaser Rounds in Monster Hunter Wilds
    PS5 – Destructoid 2025-03-24T12:42:22+00:00
  2464. Indiana Jones and the Great Circle PS5 release date announced
    PS5 – Destructoid 2025-03-24T15:46:53+00:00
  2465. Atomfall: the difference between the Standard and Deluxe Edition
    PS5 – Destructoid 2025-03-25T14:25:03+00:00
  2466. Where to find Seaside Cendrelis in Wuthering Waves 2.2
    PS5 – Destructoid 2025-03-27T08:44:10+00:00
  2467. Sims 4: Mirrors
    Simonschreibt. 2025-03-27T22:05:42+00:00
  2468. 3/29: Ding Dong The Bug Is Dead
    Demon 2025-03-30T02:06:47+00:00
  2469. Is Camellya still worth pulling in Wuthering Waves 2.2?
    PS5 – Destructoid 2025-03-31T17:05:34+00:00
  2470. Best Cantarella build in Wuthering Waves – Weapons, echoes, team compositions, and sequences
    PS5 – Destructoid 2025-03-31T20:14:50+00:00
  2471. You Can Compost That?!
    The Rot 2025-03-31T21:52:36+00:00
  2472. Ponder This Challenge - April 2025 - Klumpengeist
    IBM Ponder This 2025-04-01T00:00:00+00:00
  2473. The Generational Legacy of Samples in Music
    The Pudding 2025-04-01T05:00:00+00:00
  2474. Is Cantarella worth pulling in Wuthering Waves?
    PS5 – Destructoid 2025-04-01T15:15:10+00:00
  2475. When Jorge Luis Borges met one of the founders of AI
    Res Obscura 2025-04-02T17:27:24+00:00
  2476. Best support builds for Iansan in Genshin Impact
    PS5 – Destructoid 2025-04-03T17:52:00+00:00
  2477. Coming next week: Calm Before the Storm update
    Barotrauma 2025-04-04T16:07:16+00:00
  2478. > 194: I believe my courage will expand like a sponge cowboy in water
    Laura Olin 2025-04-10T12:58:46+00:00
  2479. Forever trapped inside a picture after kissing an eldritch being: all about Lyle from Look Outside
    Dark RPGs 2025-04-11T08:58:44+00:00
  2480. Emacs: Edit as root using sudo-edit
    Winny's Blog 2025-04-11T16:22:17+00:00
  2481. Unlock all new weapons and characters in SaGa-themed Vampire Survivors update “Emerald Diorama”
    PS5 – Destructoid 2025-04-11T20:11:24+00:00
  2482. How to quickly break the bounds of space and find love in Vampire Survivors
    PS5 – Destructoid 2025-04-12T18:20:23+00:00
  2483. FREE COMPOST, now showing
    The Rot 2025-04-12T20:39:25+00:00
  2484. Sprouts 2025 Morning Talks
    Combinatorial Game Theory 2025-04-13T02:44:00+00:00
  2485. Sprouts 2025 Keynote and Afternoon Session Summaries
    Combinatorial Game Theory 2025-04-13T14:11:00+00:00
  2486. Sprouts2025 Wrap-up
    Combinatorial Game Theory 2025-04-13T14:59:00+00:00
  2487. The Pour-igin of Species
    The Pudding 2025-04-16T05:00:00+00:00
  2488. Onfim's world
    Res Obscura 2025-04-16T13:10:51+00:00
  2489. Trunk Update and 0.33 Tournament Announcement
    Dungeon Crawl Stone Soup 2025-04-18T21:32:48+00:00
  2490. Breaking the Roman Republic: The Tragedy of Tiberius Gracchus
    Steelsnowflake 2025-04-19T12:10:53+00:00
  2491. 10PRINT inspired "Snowcrash" in Emacs
    Winny's Blog 2025-04-19T17:40:25+00:00
  2492. K.O. II EP-133 Champions update cheat sheet
    Spongefile 2025-04-19T20:58:51+00:00
  2493. Oblivion Remastered Fin Gleam Helm location
    PS5 – Destructoid 2025-04-23T21:47:23+00:00
  2494. Building a tower to reach heaven in a world that could be from a cult lost PS2 RPG: Interview with Ghrian Studio, the developer of BURGGEIST
    Dark RPGs 2025-04-24T09:26:10+00:00
  2495. How to get more Magicka in Oblivion Remastered
    PS5 – Destructoid 2025-04-25T14:55:54+00:00
  2496. Update on my Racket exit
    Winny's Blog 2025-04-25T19:47:23+00:00
  2497. On the aura of Ruth Stout & not sifting my compost
    The Rot 2025-04-25T22:13:13+00:00
  2498. Augustus Didn't Kill the Roman Republic (It Was Already Dead)
    Steelsnowflake 2025-04-28T17:50:09+00:00
  2499. 0.33 Tournament Page
    Dungeon Crawl Stone Soup 2025-04-30T01:45:08+00:00
  2500. Machine With Wishbone
    Pleasant Realms 2025-04-30T18:00:47+00:00
  2501. Thoughts on time
    Spongefile 2025-04-30T21:37:00+00:00
  2502. Ponder This Challenge - May 2025 - The prime arithmetic quiz
    IBM Ponder This 2025-05-01T00:00:00+00:00
  2503. Do rabbits eat carrots because of Clark Gable?
    Snack Stack 2025-05-01T11:53:00+00:00
  2504. Darkest Light
    Steelsnowflake 2025-05-01T12:55:36+00:00
  2505. 620 Club – St. Paul, MN
    You Care What We Think 2025-05-01T16:10:00+00:00
  2506. 0.33 “Reforge Yourself”
    Dungeon Crawl Stone Soup 2025-05-02T17:07:35+00:00
  2507. Here We Are In Eden…
    Demon 2025-05-05T03:42:40+00:00
  2508. Jellybean and Julia’s BBQ – Coon Rapids, MN
    You Care What We Think 2025-05-05T16:30:00+00:00
  2509. Where does Grand Theft Auto 6 take place?
    PS5 – Destructoid 2025-05-06T15:01:56+00:00
  2510. COMING UP WITH A COMPLETE LIST OF WAYS TO FEEL GOOD
    Infinite Gossip 2025-05-07T03:17:13+00:00
  2511. deeelite1988
    Pleasant Realms 2025-05-07T16:38:17+00:00
  2512. AI makes the humanities more important, but also a lot weirder
    Res Obscura 2025-05-07T19:24:36+00:00
  2513. Are you more likely to die on your birthday?
    The Pudding 2025-05-08T05:00:00+00:00
  2514. > 195: If I stand very still, I do no further harm.
    Laura Olin 2025-05-08T12:19:20+00:00
  2515. Lulu’s Thai Noodle Shop – Kansas City, MO
    You Care What We Think 2025-05-11T04:20:00+00:00
  2516. K.O. II EP-133 timing map
    Spongefile 2025-05-12T10:30:40+00:00
  2517. Slap’s BBQ – Kansas City, MO
    You Care What We Think 2025-05-14T16:30:00+00:00
  2518. Integers 2025 CGT Talks
    Combinatorial Game Theory 2025-05-15T02:11:00+00:00
  2519. Guy Whipping A Massive Chain
    Pleasant Realms 2025-05-15T15:19:56+00:00
  2520. An update on soil testing and LA after the fires.
    The Rot 2025-05-16T17:03:22+00:00
  2521. Building my childhood dream PC
    Fabien Sanglard 2025-05-18T00:00:00+00:00
  2522. Café Corazon – Kansas City, MO
    You Care What We Think 2025-05-18T16:30:00+00:00
  2523. Building Number Factories in Beltmatic
    a blog by biggiemac42 2025-05-19T05:31:24+00:00
  2524. Why were Belle Époque cities beautiful?
    Res Obscura 2025-05-21T19:37:37+00:00
  2525. St Louis Skills on Wheels 2025
    Pleasant Realms 2025-05-23T15:28:29+00:00
  2526. 0.33 Tournament Results
    Dungeon Crawl Stone Soup 2025-05-25T02:40:47+00:00
  2527. THE DEAL I STRUCK WITH BURGER KING
    Infinite Gossip 2025-05-26T02:50:35+00:00
  2528. Access Control Syntax
    journal.stuffwithstuff.com 2025-05-26T07:00:00+00:00
  2529. Hot Hands Pie & Biscuit – St. Paul, MN
    You Care What We Think 2025-05-27T21:07:00+00:00
  2530. Asian Misrepresentation
    The Pudding 2025-05-28T05:00:00+00:00
  2531. TP-7 guide: going deeper
    Spongefile 2025-05-29T16:04:05+00:00
  2532. 868-BACK trailer: UNIFIED
    Mighty Vision 2025-05-29T22:10:00+00:00
  2533. 868-BACK trailer: UNIFIED
    Mighty Vision 2025-05-29T22:10:00+00:00
  2534. Consider Knitting
    journal.stuffwithstuff.com 2025-05-30T07:00:00+00:00
  2535. 🕸️ 28 Years of Web Development
    Nathan Youngman 2025-05-31T00:00:00+00:00
  2536. An Aesthetic Approach
    ntoll.org 2025-05-31T10:30:00+00:00
  2537. The symbolism behind a grieving family: Analysis of the Axons of Clair Obscur Expedition 33
    Dark RPGs 2025-05-31T14:33:18+00:00
  2538. Ponder This Challenge - June 2025 - Jumping frog game
    IBM Ponder This 2025-06-01T00:00:00+00:00
  2539. Published for the first time: the Princeton INTERCAL Compiler's source code
    esoteric.codes 2025-06-01T11:27:00+00:00
  2540. LLMs are cheap
    Juho Snellman's Weblog 2025-06-02T22:00:00+00:00
  2541. Apps in the late stage gold rush
    Spongefile 2025-06-03T10:45:21+00:00
  2542. The contested cracker from Southeast Asia
    Snack Stack 2025-06-03T11:47:00+00:00
  2543. Sticker print run signup
    Spongefile 2025-06-04T10:03:16+00:00
  2544. Every announcement from PlayStation State of Play – June 2025
    PS5 – Destructoid 2025-06-04T22:02:20+00:00
  2545. The Loneliness Epidemic, in Data
    The Pudding 2025-06-05T05:00:00+00:00
  2546. 0.33.1 Bugfix Release
    Dungeon Crawl Stone Soup 2025-06-06T23:59:01+00:00
  2547. On how to compost grass, a quickie edition
    The Rot 2025-06-09T18:40:26+00:00
  2548. Stop Uploading Your Data to Google
    ignorethecode.net 2025-06-11T20:00:07+00:00
  2549. 30 Minutes with a Stranger
    The Pudding 2025-06-12T05:00:00+00:00
  2550. > 196: Remember this
    Laura Olin 2025-06-12T16:58:10+00:00
  2551. 💭 Career Break: What Will I Do
    Nathan Youngman 2025-06-13T00:00:00+00:00
  2552. 💸 Career Break: How I Got Here
    Nathan Youngman 2025-06-13T00:00:00+00:00
  2553. 56: XTDB: A Bitemporal database in Clojure
    The REPL 2025-06-13T08:23:56+00:00
  2554. PS5’s Wolverine game finally just made an appearance after several years of silence, and it’s not dead after all
    PS5 – Destructoid 2025-06-13T16:32:47+00:00
  2555. Sony says PS5 is now more profitable than any PlayStation console before it, and PS6 is on the way
    PS5 – Destructoid 2025-06-13T16:53:35+00:00
  2556. Wonderful Toolchain project update - June 2025
    Posts on asie's blog 2025-06-15T00:00:00+00:00
  2557. DESERT MOVING, INC.
    Infinite Gossip 2025-06-18T06:26:11+00:00
  2558. Our waste infrastructure lags behind the products we manufacture
    The Rot 2025-06-26T17:52:11+00:00
  2559. Video Game Thoughts Bonus Bag #6
    The Bottom Feeder 2025-06-26T18:01:07+00:00
  2560. Enter the Meadow
    Wild Information 2025-06-29T14:01:27+00:00
  2561. Ponder This Challenge - July 2025 - Swallows on a Wire
    IBM Ponder This 2025-07-01T00:00:00+00:00
  2562. Childhood California Fans
    Pleasant Realms 2025-07-02T13:04:28+00:00
  2563. Is Helldivers 2 on Xbox?
    PS5 – Destructoid 2025-07-06T18:45:00+00:00
  2564. Bears Will Be Boys
    The Pudding 2025-07-07T05:00:00+00:00
  2565. Uncontrolled Remains
    BLDGBLOG 2025-07-07T20:41:30+00:00
  2566. Architectural Dressage
    BLDGBLOG 2025-07-08T16:03:43+00:00
  2567. TRICK SHOT
    Infinite Gossip 2025-07-10T07:33:39+00:00
  2568. Dollar Country Episode 252: Beer Money
    Dollar Country Newsletter & Radio Show 2025-07-10T14:45:19+00:00
  2569. Setting Up an SDL3 Mac App in XCode 16
    journal.stuffwithstuff.com 2025-07-13T07:00:00+00:00
  2570. Dollar Country Episode 253: Lonesome Crazy
    Dollar Country Newsletter & Radio Show 2025-07-13T12:02:48+00:00
  2571. 🤖 Sudo Make Me A Triangle
    Nathan Youngman 2025-07-15T00:00:00+00:00
  2572. Dollar Country Episode 254: Music From The Great Plains
    Dollar Country Newsletter & Radio Show 2025-07-17T14:03:07+00:00
  2573. > 197: I knew so much and sang anyway
    Laura Olin 2025-07-17T14:09:08+00:00
  2574. geo/acc
    BLDGBLOG 2025-07-19T17:33:47+00:00
  2575. A New Economy
    Demon 2025-07-20T16:54:25+00:00
  2576. Seer
    BLDGBLOG 2025-07-20T21:20:52+00:00
  2577. Wallace Stevens and the Poetry We No Longer Write
    Steelsnowflake 2025-07-22T18:51:55+00:00
  2578. NYC's Urban Textscape
    The Pudding 2025-07-24T05:00:00+00:00
  2579. Dollar Country Episode 255: Makin' Steel
    Dollar Country Newsletter & Radio Show 2025-07-24T14:03:00+00:00
  2580. Those Secret Fonts from the ISA-16 PS/2 Models (Again)
    int10h.org - VileR's blog 2025-07-26T15:56:43+00:00
  2581. When Cats and a Cat God help you escape from a SCP-like facility in the dark JRPG Break Wolf [Mechanic]
    Dark RPGs 2025-07-28T09:48:03+00:00
  2582. Coming Soon ... Avernum 4: Greed and Glory!
    The Bottom Feeder 2025-07-30T19:00:26+00:00
  2583. OpenAI's "Study Mode" and the risks of flattery
    Res Obscura 2025-07-31T13:32:16+00:00
  2584. Ponder This Challenge - August 2025 - A grid-cutting game
    IBM Ponder This 2025-08-01T00:00:00+00:00
  2585. Mr. Mustacheo – West St. Paul, MN
    You Care What We Think 2025-08-02T03:36:00+00:00
  2586. Eating the Engram
    Wild Information 2025-08-03T15:01:44+00:00
  2587. This one-and-done PSP masterpiece healed my aversion to tactical RPGs
    PS5 – Destructoid 2025-08-04T17:14:20+00:00
  2588. Mineral Hurricane
    BLDGBLOG 2025-08-04T21:07:15+00:00
  2589. Quadratic Number Fields
    nklein software 2025-08-05T02:46:13+00:00
  2590. Bag of words, have mercy on us
    Experimental History 2025-08-05T20:06:09+00:00
  2591. My new book: YOU MUST UNDERSTAND THIS IF YOU WANT TO LIVE
    Infinite Gossip 2025-08-06T00:50:58+00:00
  2592. 22 Northmen Brewing Company – Alexandria, MN
    You Care What We Think 2025-08-06T16:30:00+00:00
  2593. I Have No Mouth, and I Must Scream: The 30 Year Late Review
    The Bottom Feeder 2025-08-06T18:48:54+00:00
  2594. Designing for Mastery in Roguelikes (w/Roguelike Radio)
    Grid Sage Games 2025-08-07T00:47:51+00:00
  2595. Italy's undercover pizza detectives (AVPN Is a Scam)
    Maybe Pizza? 2025-08-07T20:30:01+00:00
  2596. Still A Pressing Issue
    Discworld MUD Dev Blog 2025-08-11T10:29:39+00:00
  2597. Dicing an Onion, the Mathematically Optimal Way
    The Pudding 2025-08-12T05:00:00+00:00
  2598. Making Everything Groovy
    Discworld MUD Dev Blog 2025-08-12T11:25:51+00:00
  2599. All Souls exam questions and the limits of machine reasoning
    Res Obscura 2025-08-13T20:33:27+00:00
  2600. How To Get Internet Feedback Without Going Insane
    The Bottom Feeder 2025-08-14T18:04:47+00:00
  2601. Youthful Indiscretion
    Discworld MUD Dev Blog 2025-08-19T08:56:59+00:00
  2602. PS5 joins price-hike party in US due to ‘challenging economic environment’
    PS5 – Destructoid 2025-08-20T15:40:50+00:00
  2603. 2025 Minnesota State Fair - Falcon Heights, MN
    You Care What We Think 2025-08-21T18:28:00+00:00
  2604. LET ME TELL YOU ABOUT MY SAAB X
    Infinite Gossip 2025-08-26T10:39:09+00:00
  2605. How to Make A Mushroom Soda (That Tastes Like Peach)
    Midwesterner 2025-08-27T12:03:18+00:00
  2606. > 198: The world is a laden thing
    Laura Olin 2025-08-28T14:51:26+00:00
  2607. 😎 Summer Break
    Nathan Youngman 2025-08-29T00:00:00+00:00
  2608. 🌱 My Vegan Journey
    Nathan Youngman 2025-08-31T00:00:00+00:00
  2609. Ponder This Challenge - September 2025 - Cake flip-cutting
    IBM Ponder This 2025-09-01T00:00:00+00:00
  2610. Use this magic bullet to shoot yourself in the foot
    Experimental History 2025-09-02T16:48:01+00:00
  2611. Stickers printed
    Spongefile 2025-09-03T20:36:38+00:00
  2612. I Review the New Dungeons & Dragons Art, Unwisely
    The Bottom Feeder 2025-09-04T20:19:40+00:00
  2613. untitled
    Jon Rafman 2025-09-04T21:18:43+00:00
  2614. Wonderful Toolchain project update - September 2025
    Posts on asie's blog 2025-09-05T00:00:00+00:00
  2615. untitled
    Jon Rafman 2025-09-06T14:38:10+00:00
  2616. untitled
    Jon Rafman 2025-09-07T01:20:28+00:00
  2617. untitled
    Jon Rafman 2025-09-07T15:00:35+00:00
  2618. The curious history of Chicken in a Biskit
    Snack Stack 2025-09-07T19:10:39+00:00
  2619. untitled
    Jon Rafman 2025-09-08T15:00:26+00:00
  2620. untitled
    Jon Rafman 2025-09-09T05:08:34+00:00
  2621. untitled
    Jon Rafman 2025-09-09T15:05:13+00:00
  2622. untitled
    Jon Rafman 2025-09-09T19:04:30+00:00
  2623. untitled
    Jon Rafman 2025-09-09T19:24:50+00:00
  2624. untitled
    Jon Rafman 2025-09-09T21:50:49+00:00
  2625. untitled
    Jon Rafman 2025-09-09T23:52:37+00:00
  2626. Found The Thread
    Discworld MUD Dev Blog 2025-09-10T17:47:56+00:00
  2627. untitled
    Jon Rafman 2025-09-10T21:16:07+00:00
  2628. THE MACHINE HAS TO BE WRONG
    Infinite Gossip 2025-09-11T06:52:13+00:00
  2629. untitled
    Jon Rafman 2025-09-14T14:53:12+00:00
  2630. Pizza Roundup 005
    Maybe Pizza? 2025-09-14T22:27:17+00:00
  2631. untitled
    STML 2025-09-15T09:05:41+00:00
  2632. The mousy snack for a Dutch baby
    Snack Stack 2025-09-15T19:53:15+00:00
  2633. untitled
    Jon Rafman 2025-09-16T03:28:42+00:00
  2634. untitled
    STML 2025-09-16T06:04:25+00:00
  2635. Return of CTRI Innovations
    Fujichia 2025-09-16T11:44:31+00:00
  2636. Blog Extravaganza 2025: the winners
    Experimental History 2025-09-16T14:14:58+00:00
  2637. 7 things we want to see from the next PlayStation State of Play
    PS5 – Destructoid 2025-09-16T17:53:45+00:00
  2638. untitled
    Jon Rafman 2025-09-17T01:33:09+00:00
  2639. The Avernum 4 Story, Part 2: What Went Wrong and Why
    The Bottom Feeder 2025-09-17T18:18:22+00:00
  2640. untitled
    Jon Rafman 2025-09-18T01:08:43+00:00
  2641. The moments
    Escaping Flatland 2025-09-18T08:51:32+00:00
  2642. A Blanket Solution
    Discworld MUD Dev Blog 2025-09-18T15:01:34+00:00
  2643. Star Forts, Mines, and Other Maastricht Subterranea
    BLDGBLOG 2025-09-18T17:53:53+00:00
  2644. Celestial Detector
    BLDGBLOG 2025-09-19T05:42:09+00:00
  2645. untitled
    Jon Rafman 2025-09-23T07:45:26+00:00
  2646. untitled
    Jon Rafman 2025-09-23T08:01:15+00:00
  2647. A Couple Of QoL Tweaks To Coatings
    Discworld MUD Dev Blog 2025-09-23T13:39:42+00:00
  2648. Touch me touch me
    Muppe 2025-09-23T19:57:28+00:00
  2649. I’m crotchwalking at you in the deadmans night, are you coming with me or are you coming with me, or…
    Muppe 2025-09-23T20:01:28+00:00
  2650. buuble butt
    Muppe 2025-09-23T20:01:47+00:00
  2651. REAR-ending
    Muppe 2025-09-23T20:02:23+00:00
  2652. the feigned sound of a whistle in the town during noon
    Muppe 2025-09-23T20:03:23+00:00
  2653. Sometimes you’re brave
    Muppe 2025-09-23T20:04:30+00:00
  2654. A nerving..
    Muppe 2025-09-23T20:05:20+00:00
  2655. Howl like the Wet-Nap washed you
    Muppe 2025-09-23T20:06:16+00:00
  2656. nothing big is coming
    Muppe 2025-09-23T20:10:02+00:00
  2657. What do you think it all means, Chris?
    Muppe 2025-09-23T20:24:24+00:00
  2658. Follow like piper
    Muppe 2025-09-23T20:26:16+00:00
  2659. Byung-Chul Han, Anomalisa, and the Myth of Sameness
    Steelsnowflake 2025-09-23T22:26:07+00:00
  2660. can I be your puckish pet again?
    Muppe 2025-09-23T23:26:39+00:00
  2661. SNIKT! Marvel’s Wolverine finally makes triumphant return at new PS5 State of Play showcase
    PS5 – Destructoid 2025-09-24T21:49:11+00:00
  2662. > 199: I am building what I cannot break
    Laura Olin 2025-09-25T13:58:42+00:00
  2663. Forty-Four Esolangs: an artist's monograph of programming languages
    esoteric.codes 2025-09-26T13:37:00+00:00
  2664. Bruce Loose RIP
    Fujichia 2025-09-26T21:54:46+00:00
  2665. I made a newspaper
    The Rot 2025-09-26T23:45:45+00:00
  2666. untitled
    Jon Rafman 2025-09-27T02:09:34+00:00
  2667. Maisey Goes To Therapy
    Discworld MUD Dev Blog 2025-09-29T07:44:34+00:00
  2668. Bodhisattva
    Discworld MUD Dev Blog 2025-09-30T11:17:07+00:00
  2669. Thank you for being annoying
    Experimental History 2025-09-30T14:35:46+00:00
  2670. Glenn Ligon, “Condition Report” (2000)
    STML 2025-09-30T18:18:27+00:00
  2671. Ponder This Challenge - October 2025 - Counting Mazes
    IBM Ponder This 2025-09-30T23:00:00+00:00
  2672. How I read
    Escaping Flatland 2025-10-01T12:30:14+00:00
  2673. Obscure Emacs Package: ssh-config-mode
    Winny's Blog 2025-10-01T20:24:47+00:00
  2674. Video Game Thoughts Bonus Bag #7
    The Bottom Feeder 2025-10-01T21:05:56+00:00
  2675. The Age of Books and the Age of Brainrot
    Res Obscura 2025-10-02T13:45:15+00:00
  2676. How to fix Black Ops 7 ‘files cannot be managed in game by users on this platform’ error
    PS5 – Destructoid 2025-10-02T20:18:28+00:00
  2677. Which horse should you choose in Ghost of Yotei? Old Trails quest decision
    PS5 – Destructoid 2025-10-02T20:51:53+00:00
  2678. When is Ghost of Yotei coming to PC?
    PS5 – Destructoid 2025-10-02T21:25:30+00:00
  2679. Napoleon's Fiasco: The Last Days of the Haitian Revolution
    Steelsnowflake 2025-10-03T00:10:50+00:00
  2680. Book Review: Ulysses Audiobook
    Fujichia 2025-10-03T11:31:35+00:00
  2681. untitled
    Jon Rafman 2025-10-03T18:14:13+00:00
  2682. Unity Security Vulnerability Update
    Demon 2025-10-03T23:29:22+00:00
  2683. Birth of Prettier
    Vjeux 2025-10-04T20:33:06+00:00
  2684. Sick Reverie
    Steelsnowflake 2025-10-06T05:00:00+00:00
  2685. The smell of earth
    The Rot 2025-10-06T18:05:16+00:00
  2686. Debian Package Stats using Sqlite
    Winny's Blog 2025-10-06T23:38:17+00:00
  2687. Python 3.14 - Changes to look for
    Winny's Blog 2025-10-07T20:55:58+00:00
  2688. OUT ON THE DUNGEON FLOOR
    Infinite Gossip 2025-10-08T06:02:40+00:00
  2689. Agentic fragments
    Escaping Flatland 2025-10-08T13:25:01+00:00
  2690. The Rats in Look Outside: A spreading disease of fur and teeth
    Dark RPGs 2025-10-08T14:26:42+00:00
  2691. The Chicago three-dick salute
    Food is Stupid 2025-10-10T13:03:14+00:00
  2692. G-G on Facebook - G-G on Twitter
    garfield minus garfield 2025-10-12T19:15:11+00:00
  2693. Some usecases for GNU Units
    Winny's Blog 2025-10-12T19:33:05+00:00
  2694. Bamboozle me, daddy
    Experimental History 2025-10-14T15:48:04+00:00
  2695. Avernum 4: Greed and Glory Demo Out, Plus An Interview
    The Bottom Feeder 2025-10-14T18:50:35+00:00
  2696. List Of Topics Discussed
    Fujichia 2025-10-15T20:27:24+00:00
  2697. Meet Danny and his compost app Peels
    The Rot 2025-10-17T15:51:18+00:00
  2698. Another Coat Of Pain
    Discworld MUD Dev Blog 2025-10-18T15:15:18+00:00
  2699. What it’s like to walk across Massachusetts
    The Pudding 2025-10-20T05:00:00+00:00
  2700. One of my favourite paintings:
    STML 2025-10-21T06:21:56+00:00
  2701. “Terracotta anatomical votive; human eye.” 3rdC BC-1stC BC, Italy.
    STML 2025-10-21T06:38:40+00:00
  2702. Eye idol ca. 3700–3500 BCE On view at The Met Fifth Avenue in Gallery 202 This type of figurine…
    STML 2025-10-21T06:40:36+00:00
  2703. untitled
    STML 2025-10-21T06:45:40+00:00
  2704. Lover’s Eye, returned to Lender.
    STML 2025-10-21T06:46:12+00:00
  2705. The curious, contentious history of pumpkin spice lattes
    Snack Stack 2025-10-21T19:55:31+00:00
  2706. Sheathing And A Peek Ahead
    Discworld MUD Dev Blog 2025-10-22T05:15:07+00:00
  2707. A Case Of Scope Creep
    Discworld MUD Dev Blog 2025-10-22T07:40:34+00:00
  2708. Avernum 4: Greed And Glory Is Out!
    The Bottom Feeder 2025-10-22T13:25:55+00:00
  2709. Recent Spooky Movies
    Fujichia 2025-10-23T11:26:15+00:00
  2710. When is it better to think without words?
    Escaping Flatland 2025-10-23T12:10:48+00:00
  2711. This stunningly gorgeous open-world RPG from Korean devs looks too good to be true, and I’m hoping they prove me wrong
    PS5 – Destructoid 2025-10-23T21:25:15+00:00
  2712. I Can't Believe It's Not Butter! Chicken
    Food is Stupid 2025-10-24T13:02:49+00:00
  2713. text-mode: [水墨]奔马图 by Gatchaman (2011).
    text-mode 2025-10-24T18:31:02+00:00
  2714. actegratuit: Mantras and Meditations by Meg Hitchcock
    text-mode 2025-10-25T18:31:07+00:00
  2715. My Door Is Aways Unlocked
    Discworld MUD Dev Blog 2025-10-26T16:10:13+00:00
  2716. text-mode: Zdeněk Sýkora’s ventilation tower and paintings....
    text-mode 2025-10-26T18:31:08+00:00
  2717. text-mode: The myth about Bird B (K. Holten & E. Mourier,...
    text-mode 2025-10-27T19:30:39+00:00
  2718. The Decline of Deviance
    Experimental History 2025-10-28T15:21:32+00:00
  2719. text-mode: ‘Grace Triptych’ by Keira Rathbone, 2011.
    text-mode 2025-10-28T19:30:33+00:00
  2720. There's Always A Catch.
    Discworld MUD Dev Blog 2025-10-29T15:10:59+00:00
  2721. text-mode: Jiří Valoch “Homage to Ladislav Novák”
    text-mode 2025-10-29T19:30:32+00:00
  2722. text-mode: Three mainstream computer heroes, rendered in their...
    text-mode 2025-10-30T19:31:02+00:00
  2723. I Can't Believe It's Not Butter! Or Chicken!
    Food is Stupid 2025-10-31T13:02:50+00:00
  2724. text-mode: Klaus Basset, Kubus, 1974. Typewriter graphics only...
    text-mode 2025-10-31T19:30:39+00:00
  2725. Ponder This Challenge - November 2025 - The CAT sequence
    IBM Ponder This 2025-11-01T00:00:00+00:00
  2726. The Epic Last Stand of Louis Delgrès: “Live Free or Die”
    Steelsnowflake 2025-11-01T12:18:43+00:00
  2727. text-mode: Space Harrier, text mode version. For the Japanese...
    text-mode 2025-11-01T19:30:36+00:00
  2728. text-mode: Carl Fernbach-Flarsheim - Boolean Image/Conceptual...
    text-mode 2025-11-02T19:30:53+00:00
  2729. In pursuit of democracy
    The Pudding 2025-11-03T06:00:00+00:00
  2730. We Need to Talk About Black Walnuts (Again)
    Midwesterner 2025-11-03T12:03:01+00:00
  2731. text-mode: Commodore 64 PETSCII acid wolf fax graphics by...
    text-mode 2025-11-03T19:30:56+00:00
  2732. A list of books and essays that I love
    Escaping Flatland 2025-11-04T11:27:24+00:00
  2733. New features of the EP-40 Riddim
    Spongefile 2025-11-04T14:23:33+00:00
  2734. text-mode: Kindergarten Paper Weavings from circa 1900...
    text-mode 2025-11-04T19:30:57+00:00
  2735. "get 1000 Mountains Of Ash"
    Discworld MUD Dev Blog 2025-11-05T05:10:38+00:00
  2736. A Lost IBM PC/AT Model? Analyzing a Newfound Old BIOS
    int10h.org - VileR's blog 2025-11-05T07:07:01+00:00
  2737. Introducing: Springfield-Style Black Walnut Chicken
    Midwesterner 2025-11-05T12:02:52+00:00
  2738. The PS5 handheld’s latest update may have just made it a must-buy for the holiday season
    PS5 – Destructoid 2025-11-05T15:36:53+00:00
  2739. Lioconcha hieroglyphica is a saltwater clam that makes cellular automata-style patterns in some kind…
    text-mode 2025-11-05T19:30:36+00:00
  2740. Four Ways To Make Your Turn-Based Game More Interesting (Or Ruin It)
    The Bottom Feeder 2025-11-05T20:50:09+00:00
  2741. > 200: We were trying to live a personal life
    Laura Olin 2025-11-06T14:32:22+00:00
  2742. Can automation help make the humanities more human?
    Res Obscura 2025-11-06T23:08:09+00:00
  2743. Lighting The Way
    Discworld MUD Dev Blog 2025-11-07T05:42:22+00:00
  2744. The Black Walnut Snack Pack
    Midwesterner 2025-11-07T12:03:36+00:00
  2745. Shrimp cocktail
    Food is Stupid 2025-11-07T14:03:17+00:00
  2746. A Heavier Coat Of Pain
    Discworld MUD Dev Blog 2025-11-08T13:43:40+00:00
  2747. It's not always easy to leave your leaves
    The Rot 2025-11-10T00:27:36+00:00
  2748. THE HANDSOME BROTHERS
    Infinite Gossip 2025-11-10T05:57:26+00:00
  2749. this guy sucks at throwing
    Terrible Banana 2025-11-11T18:38:24+00:00
  2750. You say potato, I say leprosy
    Experimental History 2025-11-12T00:45:53+00:00
  2751. The McSlug
    Food is Stupid 2025-11-14T14:03:38+00:00
  2752. How quake.exe got its TCP/IP stack
    Fabien Sanglard 2025-11-17T00:00:00+00:00
  2753. The Time I Annoyed Lord British and He Gave Me His Debris
    The Bottom Feeder 2025-11-17T22:20:29+00:00
  2754. How well can Gemini 3 make a Henry James simulator?
    Res Obscura 2025-11-19T00:27:38+00:00
  2755. “We strike a balance of what we call a “grounded openness” that avoids the traps of provincialism…
    STML 2025-11-19T10:05:34+00:00
  2756. so a very long time ago, my dad worked with an arson investigator
    STML 2025-11-19T10:06:55+00:00
  2757. When I accept myself just as I am, I change
    Escaping Flatland 2025-11-19T10:46:28+00:00
  2758. Talk To Me
    Fujichia 2025-11-19T18:05:14+00:00
  2759. Evil pizza
    Food is Stupid 2025-11-21T14:03:38+00:00
  2760. Please, Support Books
    ignorethecode.net 2025-11-22T14:21:51+00:00
  2761. Grimdark JRPGs for fans of Fear & Hunger
    Dark RPGs 2025-11-23T14:07:13+00:00
  2762. Avatar: The Last Airbender Draft, Round 2 (WUBRG Drafting)
    Mediocre Magic 2025-11-23T15:35:00+00:00
  2763. Quake Engine Indicators
    Fabien Sanglard 2025-11-24T00:00:00+00:00
  2764. Frepack Draft: Clue and Explorers of Ixalan
    Mediocre Magic 2025-11-24T01:42:00+00:00
  2765. Composting textiles with Everybody.World
    The Rot 2025-11-24T19:33:37+00:00
  2766. Understanding The Player Brain, Pt. 1: Loss Avoidance
    The Bottom Feeder 2025-11-24T21:42:31+00:00
  2767. Zombie cakes, the dead dessert of the 1950s
    Snack Stack 2025-11-24T23:06:45+00:00
  2768. Secrets of the ancient memelords
    Experimental History 2025-11-25T19:17:23+00:00
  2769. Year Of Me
    Mighty Vision 2025-11-25T23:07:00+00:00
  2771. Generalized Worley Noise
    Ian Henry 2025-11-26T00:00:00+00:00
  2772. How the EP series fader works
    Spongefile 2025-11-26T10:23:43+00:00
  2773. Hogswatch Timing
    Discworld MUD Dev Blog 2025-11-27T20:48:43+00:00
  2774. Wonderful Toolchain project update - November 2025
    Posts on asie's blog 2025-11-30T00:00:00+00:00
  2775. Periodic Spaces
    Ian Henry 2025-11-30T00:00:00+00:00
  2776. Ponder This Challenge - December 2025 - Sums of a prime and an even number
    IBM Ponder This 2025-11-30T22:15:00+00:00
  2777. Electricity for fun (and mechatronics teachers)
    Spongefile 2025-12-02T00:33:00+00:00
  2778. Just and loving seeing
    Escaping Flatland 2025-12-02T12:55:45+00:00
  2779. Why WinQuake exists and how it works
    Fabien Sanglard 2025-12-03T00:00:00+00:00
  2780. Ravnica Clue + Avatar: TLA Beginner's Box
    Mediocre Magic 2025-12-03T23:55:00+00:00
  2781. Why I have been writing a niche history blog for 15 years
    Res Obscura 2025-12-04T18:43:04+00:00
  2782. > 201: A taxi cab floating across three lanes with its lamp lit
    Laura Olin 2025-12-04T19:43:07+00:00
  2783. Elattes
    Food is Stupid 2025-12-05T14:02:44+00:00
  2784. Teenage Engineering connection cheat sheets
    Spongefile 2025-12-05T14:38:47+00:00
  2785. The sunny snack from Azerbaijan
    Snack Stack 2025-12-05T20:00:24+00:00
  2786. A Full Pod of Chaos (WUBRG Drafting)
    Mediocre Magic 2025-12-07T01:02:00+00:00
  2787. Feed the Soil (and the rest will follow)
    Wild Information 2025-12-07T22:35:34+00:00
  2788. The hidden Superbosses of Look Outside (till v2.1)
    Dark RPGs 2025-12-08T20:19:51+00:00
  2789. Common Threads
    The Pudding 2025-12-09T06:00:00+00:00
  2790. The drug that taught me how much I should suffer
    Experimental History 2025-12-09T17:46:17+00:00
  2791. Sean Sherman's Pápa Waháŋpi
    Midwesterner 2025-12-09T22:37:36+00:00
  2792. [Outliers] Bernie Marcus: The Home Depot Story
    Farnam Street 2025-12-11T10:30:00+00:00
  2793. Ask a Midwesterner: Why Can't I Find a Bowl of Pápa Waháŋpi?
    Midwesterner 2025-12-11T12:03:07+00:00
  2794. Reflections on my first year writing full time
    Escaping Flatland 2025-12-11T17:12:02+00:00
  2795. Video Game Thoughts Bonus Bag #8
    The Bottom Feeder 2025-12-11T22:18:47+00:00
  2796. Haiku Activity & Contract Report, November 2025 (ft. Go)
    Haiku Project 2025-12-12T21:20:00+00:00
  2797. Microbes at work
    The Rot 2025-12-16T17:43:34+00:00
  2798. FIRST OF JUNE
    Infinite Gossip 2025-12-17T04:34:44+00:00
  2799. Game console prices reached unfortunate highs in 2025, and November sales hit rockbottom because of it
    PS5 – Destructoid 2025-12-17T17:17:13+00:00
  2800. events this week!!
    Fujichia 2025-12-17T23:00:08+00:00
  2801. Year 12 of the Cogmind
    Grid Sage Games 2025-12-18T02:54:36+00:00
  2802. Be Your Best in 2026: The Most Important Lessons from The Knowledge Project (2025)
    Farnam Street 2025-12-18T10:30:00+00:00
  2803. The closest thing we might ever get to a new Dino Crisis game is coming in a few weeks, but I’m extremely skeptical
    PS5 – Destructoid 2025-12-18T16:03:52+00:00
  2804. WHAT I LEARNED ABOUT PUBLISHING SHORT STORIES ONLINE IN 2025
    Infinite Gossip 2025-12-19T03:34:53+00:00
  2805. Nog Eggs
    Food is Stupid 2025-12-19T14:03:10+00:00
  2806. The Gerrit code review iceberg, episode 3
    Haiku Project 2025-12-19T20:30:00+00:00
  2807. Two Multiplayer Mostly-Avatar Drafts (WUBRG Drafting)
    Mediocre Magic 2025-12-21T04:53:00+00:00
  2808. Pfeffernög
    Midwesterner 2025-12-22T12:00:29+00:00
  2809. "AI" is bad UX
    Apperceptive by Sam 2025-12-22T16:34:01+00:00
  2810. Chaos Collects Clues (WUBRG Drafting)
    Mediocre Magic 2025-12-24T00:01:00+00:00
  2811. The Outlier Playbook: The Patterns Behind Enduring Success
    Farnam Street 2025-12-25T10:30:00+00:00
  2812. Top ten composts this year
    The Rot 2025-12-26T21:51:20+00:00
  2813. Pierre Poilievre on the Role of Government, Freedom, and Affordability
    Farnam Street 2025-12-27T12:13:05+00:00
  2814. Those Curious Naturalists
    Wild Information 2025-12-28T16:39:01+00:00
  2815. worldsofzzt: Source “Toypole” by Agent Orange (2022) Published...
    text-mode 2025-12-29T11:22:48+00:00
  2816. untitled
    text-mode.org 2025-12-29T14:37:26+00:00
  2817. untitled
    text-mode.org 2025-12-29T14:45:46+00:00
  2818. untitled
    text-mode.org 2025-12-29T15:10:13+00:00
  2819. untitled
    text-mode.org 2025-12-29T15:24:12+00:00
  2820. untitled
    text-mode.org 2025-12-29T15:35:29+00:00
  2821. Typewriter works by Montserrat Alberich Escardívol (1912-1973). Her first exhibition was in 1929…
    text-mode 2025-12-29T19:30:37+00:00
  2822. The Top Ways Video Games Affect Your Brain. Number Five May Disturb You!
    The Bottom Feeder 2025-12-29T21:43:22+00:00
  2823. Ponder This Challenge - January 2026 - Number splitting
    IBM Ponder This 2025-12-29T22:15:00+00:00
  2824. Ravnica Clue EDH over SpellTable
    Mediocre Magic 2025-12-30T03:22:00+00:00
  2825. Various works by Haji, 1998-2001. via 16colors
    text-mode 2025-12-30T19:30:29+00:00
  2826. Between Ruin & Repair
    City of Yes 2025-12-31T15:30:39+00:00
  2827. Anamie by Hack n’ Trade and Razor 1991. A PC-demo based on Amiga ASCII and some custom characters.
    text-mode 2025-12-31T19:30:31+00:00
  2828. The Gerrit code review iceberg, episode 4
    Haiku Project 2025-12-31T22:30:00+00:00
  2829. James Clear: How to Build Good Habits & Break Bad Ones
    Farnam Street 2026-01-01T10:30:00+00:00
  2830. > 202: What resonated, 2025
    Laura Olin 2026-01-01T15:36:44+00:00
  2831. Näyttää Betonilta by Duce, 2025. C64 PETSCII, inspired by Odeith.
    text-mode 2026-01-01T19:30:31+00:00
  2832. The Power of the Image. Artists vs fascists
    We Make Money Not Art 2026-01-02T14:48:19+00:00
  2833. fungi.neocities.org is a site by Polyducks, who’s been featured here many times before. It features…
    text-mode 2026-01-02T19:30:34+00:00
  2834. Textual Paint is a textmode version of MS Paint that runs in the terminal. Made by Isaiah Odhner.
    text-mode 2026-01-03T19:30:29+00:00
  2835. Interview with yayimhere
    esoteric.codes 2026-01-05T04:06:00+00:00
  2836. The dating industry is weird.
    The Curiosity Cabinet 2026-01-06T03:26:05+00:00
  2837. The secrets of human-animal hybrids escaping from a SCP-like facility: Interview with RE Atelier, the team behind the great JRPG Break Wolf
    Dark RPGs 2026-01-06T11:20:33+00:00
  2838. How to be less awkward
    Experimental History 2026-01-06T16:45:44+00:00
  2839. Being creative requires taking risks
    Escaping Flatland 2026-01-07T11:59:34+00:00
  2840. Chaos is Clued In (WUBRG Drafting)
    Mediocre Magic 2026-01-07T21:46:00+00:00
  2841. Building a 1997 Quake PC!
    Fabien Sanglard 2026-01-08T00:00:00+00:00
  2842. [Outliers] The Multidisciplinary Approach to Thinking | Peter D. Kaufman
    Farnam Street 2026-01-08T10:18:00+00:00
  2843. Chaos Jumpstart Clue
    Mediocre Magic 2026-01-08T15:37:00+00:00
  2844. Algorithmic hover states with contrast-color()
    daverupert.com 2026-01-08T16:46:00+00:00
  2845. Using your design system colors with contrast-color()
    daverupert.com 2026-01-09T03:21:00+00:00
  2846. untitled
    https://jennifermillsnews.tumblr.com/ 2026-01-09T13:18:28+00:00
  2847. untitled
    text-mode.org 2026-01-09T14:00:19+00:00
  2848. Interpolate contrast-color() to manipulate lightness
    daverupert.com 2026-01-09T15:35:00+00:00
  2849. Only Connect
    City of Yes 2026-01-09T16:16:21+00:00
  2850. The computer is the key to love.
    The Curiosity Cabinet 2026-01-09T20:22:58+00:00
  2851. Murder
    Mark Bernstein 2026-01-10T15:25:05+00:00
  2852. Focus rings with nested contrast-color()?
    daverupert.com 2026-01-11T18:30:00+00:00
  2853. 0.34 Trunk Update and Tournament Announcement
    Dungeon Crawl Stone Soup 2026-01-11T20:04:32+00:00
  2854. Building a 1997 Quake PC: Benchmarking Quake
    Fabien Sanglard 2026-01-12T00:00:00+00:00
  2855. Atlas of Borders. Walls, Migrations and Conflict in 70 Maps
    We Make Money Not Art 2026-01-12T09:52:44+00:00
  2856. Disunion
    Mark Bernstein 2026-01-12T13:00:52+00:00
  2857. Names
    Mark Bernstein 2026-01-12T13:15:38+00:00
  2858. Questions
    Mark Bernstein 2026-01-12T13:29:51+00:00
  2859. Building a 1997 Quake PC: Benchmarking Vquake
    Fabien Sanglard 2026-01-13T00:00:00+00:00
  2860. Haiku Activity & Contract Report, December 2025
    Haiku Project 2026-01-13T01:00:00+00:00
  2861. Building a 1997 Quake PC: Benchmarking GLquake
    Fabien Sanglard 2026-01-14T00:00:00+00:00
  2862. HUD, History, and What’s Ahead
    City of Yes 2026-01-14T14:02:53+00:00
  2863. Morgan Housel: Wealth is What You Have Minus What You Want
    Farnam Street 2026-01-15T10:30:00+00:00
  2864. On the preparations before writing an essay
    Escaping Flatland 2026-01-15T11:03:57+00:00
  2865. What making community compost means now
    The Rot 2026-01-15T15:46:49+00:00
  2866. Is QSpy still cool? Let's play QuakeWorld!
    Fabien Sanglard 2026-01-16T00:00:00+00:00
  2867. untitled
    https://jennifermillsnews.tumblr.com/ 2026-01-16T13:16:27+00:00
  2868. When “Just One More Lane” Runs Out of Road
    City of Yes 2026-01-16T14:25:23+00:00
  2869. 150 questions to fall in love.
    The Curiosity Cabinet 2026-01-16T16:20:28+00:00
  2870. MACGUFFIN WORLD
    Infinite Gossip 2026-01-17T04:36:10+00:00
  2871. Not only Nemesis and Mr X: immortal stalkers and chasing enemies in turn-based JRPGs [Updated Jan 2026]
    Dark RPGs 2026-01-17T09:32:55+00:00
  2872. A Hedgehog in a Fox’s World: Paul Kingsnorth’s "Against the Machine"
    Steelsnowflake 2026-01-17T14:09:17+00:00
  2873. Lorwyn Eclipsed Prerelease
    Mediocre Magic 2026-01-17T21:51:00+00:00
  2874. The best version of my site so far...
    daverupert.com 2026-01-18T14:29:00+00:00
  2875. On Canned Tomatoes (They Can Be Pretty Awesome)
    Maybe Pizza? 2026-01-18T22:54:47+00:00
  2876. Jews and Words
    Mark Bernstein 2026-01-19T14:35:38+00:00
  2877. La Société Automatique. Reminding us that the tech industry is based on myths as much as on science
    We Make Money Not Art 2026-01-19T15:20:50+00:00
  2878. MBBP
    Midwesterner 2026-01-20T12:03:04+00:00
  2879. Text is king
    Experimental History 2026-01-20T18:55:47+00:00
  2880. FOSDEM 2026
    Haiku Project 2026-01-22T07:00:00+00:00
  2881. [Outliers] Ray Kroc: How McDonald’s Took Over America
    Farnam Street 2026-01-22T10:30:00+00:00
  2882. The Age of Assholes
    City of Yes 2026-01-22T14:03:14+00:00
  2883. A Psalm for the Wild-Built
    Mark Bernstein 2026-01-22T19:50:31+00:00
  2884. Lorwyn Eclipsed Clue Draft (WUBRG Drafting)
    Mediocre Magic 2026-01-23T02:30:00+00:00
  2885. untitled
    STML 2026-01-23T10:47:55+00:00
  2886. untitled
    https://jennifermillsnews.tumblr.com/ 2026-01-23T13:10:37+00:00
  2887. Oysters jubilee
    Food is Stupid 2026-01-23T14:03:17+00:00
  2888. Omikron: The Nomad Soul
    The Digital Antiquarian 2026-01-23T16:34:02+00:00
  2889. Video Game Thoughts Bonus Bag #9
    The Bottom Feeder 2026-01-23T19:31:30+00:00
  2890. Ravnica Clue with Assorted Jumpstart Packs
    Mediocre Magic 2026-01-24T02:22:00+00:00
  2891. The Value of Things
    journal.stuffwithstuff.com 2026-01-24T08:00:00+00:00
  2892. Waiting for the power to go out
    daverupert.com 2026-01-24T21:44:00+00:00
  2893. A beauty pageant, judged by a computer.
    The Curiosity Cabinet 2026-01-25T13:31:07+00:00
  2894. Porting 100k lines from TypeScript to Rust using Claude Code in a month
    Vjeux 2026-01-25T17:42:47+00:00
  2895. Lorwyn Eclipsed Draft (WUBRG Drafting)
    Mediocre Magic 2026-01-26T02:27:00+00:00
  2896. I'm swearing off APIs entirely
    daverupert.com 2026-01-26T06:27:00+00:00
  2897. I know your secret
    Experimental History 2026-01-27T16:22:32+00:00
  2898. The Bullet That Missed
    Mark Bernstein 2026-01-27T17:48:24+00:00
  2899. Ponder This Challenge - February 2026 - Blot-avoiding backgammon strategy
    IBM Ponder This 2026-01-27T22:00:00+00:00
  2900. On political power
    Escaping Flatland 2026-01-28T12:03:38+00:00
  2901. 0.34 Tournament Page and Trunk Update
    Dungeon Crawl Stone Soup 2026-01-29T03:23:59+00:00
  2902. Michael Ovitz: The Psychology of Power
    Farnam Street 2026-01-29T10:30:00+00:00
  2903. Repost: Jerry's Apartment
    City of Yes 2026-01-29T14:02:37+00:00
  2904. It's still what computers still can't do
    Apperceptive by Sam 2026-01-29T21:06:22+00:00
  2905. untitled
    https://jennifermillsnews.tumblr.com/ 2026-01-30T13:18:28+00:00
  2906. A wedding in a peculiar venue.
    The Curiosity Cabinet 2026-01-30T13:42:53+00:00
  2907. Honey bunion rings
    Food is Stupid 2026-01-30T14:03:29+00:00
  2908. The Cookie Theory of Collective Action
    Snack Stack 2026-01-30T21:52:40+00:00
  2909. Frepack Draft #(n+12) (WUBRG Drafting)
    Mediocre Magic 2026-02-02T01:19:00+00:00
  2910. From $0 to $100 million in DonutSMP over the weekend
    Vjeux 2026-02-02T03:52:33+00:00
  2911. The Gerrit code review iceberg, episode 5
    Haiku Project 2026-02-03T12:30:00+00:00
  2912. Underrated ways to change the world, vol. II
    Experimental History 2026-02-03T16:28:19+00:00
  2913. Launching The Rural Guaranteed Minimum Income Initiative
    Coding Horror 2026-02-04T07:43:56+00:00
  2914. Write about the future you want
    daverupert.com 2026-02-04T15:45:00+00:00
  2915. Let's compile Quake like it's 1997!
    Fabien Sanglard 2026-02-05T00:00:00+00:00
  2916. > 203: I stand at the lip of a pouting valley—SPEAK TO ME!
    Laura Olin 2026-02-05T13:04:03+00:00
  2917. 0.34 “Doomed Geometries”
    Dungeon Crawl Stone Soup 2026-02-06T02:33:27+00:00
  2918. Things that connect us to ourselves, and things that don't
    Escaping Flatland 2026-02-06T11:54:11+00:00
  2919. untitled
    https://jennifermillsnews.tumblr.com/ 2026-02-06T13:12:29+00:00
  2920. A Visit to the Bentham Project at University College London (UCL)
    Practical Ethics 2026-02-06T14:46:08+00:00
  2921. Ultima IX
    The Digital Antiquarian 2026-02-06T17:09:05+00:00
  2922. You can just dig a hole.
    The Rot 2026-02-07T22:53:25+00:00
  2923. Superb Snack History: The secret life of seven-layer dip
    Snack Stack 2026-02-08T20:51:43+00:00
  2924. Magic Words
    daverupert.com 2026-02-09T16:03:00+00:00
  2925. Chaos needs to read the whole card (WUBRG Drafting)
    Mediocre Magic 2026-02-10T02:31:00+00:00
  2926. CHRISTOPHER'S STRONG LEGS
    Infinite Gossip 2026-02-10T06:05:12+00:00
  2927. Nicolai Tangen: The $2 Trillion Mind
    Farnam Street 2026-02-12T10:30:00+00:00
  2928. Gemini's Hypothetical Present
    Jeff Kaufman 2026-02-12T13:00:00+00:00
  2929. Tokyo: The Megacity at Human Scale
    City of Yes 2026-02-12T14:02:48+00:00
  2930. Zone2Source, a testing ground for art and ecology
    We Make Money Not Art 2026-02-12T14:22:17+00:00
  2931. What Exact Products Do Games Sell, Two Case Studies
    The Bottom Feeder 2026-02-12T19:00:18+00:00
  2932. Haiku Activity & Contract Report, January 2026
    Haiku Project 2026-02-13T03:00:00+00:00
  2933. untitled
    https://jennifermillsnews.tumblr.com/ 2026-02-13T13:10:28+00:00
  2934. AI: New Frontiers
    Mark Bernstein 2026-02-13T16:39:17+00:00
  2935. How Michael Abrash doubled Quake framerate
    Fabien Sanglard 2026-02-14T00:00:00+00:00
  2936. The New Roman Empire: A History of Byzantium
    Mark Bernstein 2026-02-14T02:13:48+00:00
  2937. Advertising for Love
    The Curiosity Cabinet 2026-02-14T13:02:55+00:00
  2938. It’s In The Blood
    Demon 2026-02-14T23:41:35+00:00
  2939. Text Posts from the Kids Group: 2025
    Jeff Kaufman 2026-02-15T13:00:00+00:00
  2940. We need to continue to sing this song
    gilest.org 2026-02-15T19:14:33+00:00
  2941. 1998 Ebook!
    The Digital Antiquarian 2026-02-16T13:32:30+00:00
  2942. Mental Health Chatbots: on Truth and Bullshit
    Practical Ethics 2026-02-16T15:51:58+00:00
  2943. I swear the UFO is coming any minute
    Experimental History 2026-02-17T16:15:11+00:00
  2944. Sizing Chaos
    The Pudding 2026-02-18T06:00:00+00:00
  2945. The need to make art
    Escaping Flatland 2026-02-18T10:42:17+00:00
  2946. Rethinking the Ethics and Politics of the Global Campaign Against Female Genital Cutting
    Practical Ethics 2026-02-18T13:36:54+00:00
  2947. What is happening to writing?
    Res Obscura 2026-02-18T14:52:04+00:00
  2948. [Outliers] Phil Knight: The Obsession That Built Nike
    Farnam Street 2026-02-19T10:30:00+00:00
  2949. You May Already Be Canadian
    Jeff Kaufman 2026-02-19T13:00:00+00:00
  2950. Everything’s in the Shitter
    City of Yes 2026-02-19T15:02:47+00:00
  2951. New Book: Protecting Minds – The Right Against Mental Interference
    Practical Ethics 2026-02-20T12:55:50+00:00
  2952. untitled
    https://jennifermillsnews.tumblr.com/ 2026-02-20T13:18:28+00:00
  2953. House-smoked tuna
    Food is Stupid 2026-02-20T14:03:26+00:00
  2954. Gabriel Knight 3: Blood of the Sacred, Blood of the Damned
    The Digital Antiquarian 2026-02-20T17:24:46+00:00
  2955. Chaos Commander Draft
    Mediocre Magic 2026-02-20T20:56:00+00:00
  2956. Google Summer of Code 2026
    Neovim 2026-02-21T00:00:00+00:00
  2957. Haiku to mentor interns in Google Summer of Code 2026
    Haiku Project 2026-02-21T11:47:20+00:00
  2958. AI: Polite?
    Mark Bernstein 2026-02-22T01:56:21+00:00
  2959. Storing Food
    Jeff Kaufman 2026-02-22T13:00:00+00:00
  2960. Looking for a man who won't "ogle."
    The Curiosity Cabinet 2026-02-22T14:15:40+00:00
  2961. Priority of idle hands
    daverupert.com 2026-02-23T03:13:00+00:00
  2962. Smaller and dumber
    daverupert.com 2026-02-23T05:33:00+00:00
  2963. Dollar Country 259: Country, Bluegrass, and Gospel from North Carolina
    Dollar Country Newsletter & Radio Show 2026-02-23T20:08:35+00:00
  2964. The Secret History of Knocking on Wood
    Res Obscura 2026-02-24T14:12:37+00:00
  2965. Reflecting on Self (human and AI)
    ntoll.org 2026-02-24T18:00:00+00:00
  2966. 0.34 Tournament Results
    Dungeon Crawl Stone Soup 2026-02-25T01:07:54+00:00
  2967. Happy Map
    The Pudding 2026-02-25T06:00:00+00:00
  2968. Be prepared for a cardtastic event…. a competition…. where even games may come true … !…
    Muppe 2026-02-25T13:39:51+00:00
  2969. Inside the Mind of Robinhood Co-Founder Vlad Tenev
    Farnam Street 2026-02-26T10:30:00+00:00
  2970. Getting a better sense for when you’re thinking well and when you’re faking it
    Escaping Flatland 2026-02-26T11:39:21+00:00
  2971. Why “Plants Have Feelings Too” Is a Terrible Argument Against Veganism
    Steelsnowflake 2026-02-26T12:40:13+00:00
  2972. Slaloming Towards Olympus
    City of Yes 2026-02-26T14:59:23+00:00
  2973. Here's to the Polypropylene Makers
    Jeff Kaufman 2026-02-27T13:00:00+00:00
  2974. untitled
    https://jennifermillsnews.tumblr.com/ 2026-02-27T13:20:29+00:00
  2975. Beefed corn
    Food is Stupid 2026-02-27T14:01:03+00:00
  2976. ‘It’s Physical, Not Intellectual’: The Ethics of Correcting Assumptions About Disability
    Practical Ethics 2026-02-27T14:31:50+00:00
  2977. We The Bacteria. Notes Toward Biotic Architecture
    We Make Money Not Art 2026-02-27T14:38:48+00:00
  2978. Make things OpenSAFELY, it makes things better
    gilest.org 2026-02-27T15:42:55+00:00
  2979. Be prepared for a cardtastic event…. a competition…. where even games may come true … !…
    Muppe 2026-02-28T02:06:41+00:00
  2980. WE ARE GOING LIVE WITH CARDWRIGHTS CARDS WILL BE BORN HOSTED BY:
    Muppe 2026-02-28T02:06:53+00:00
  2981. Ponder This Challenge - March 2026 - Path game on a hole-riddled chessboard
    IBM Ponder This 2026-02-28T22:00:00+00:00
  2982. Introducing and Deprecating WoFBench
    Jeff Kaufman 2026-03-01T13:00:00+00:00
  2983. Balloons, bras, and body slams.
    The Curiosity Cabinet 2026-03-01T14:06:42+00:00
  2984. Is Prostitution Just a Job?
    Practical Ethics 2026-03-02T11:35:29+00:00
  2985. Can Pornography be Feminist in a Mass Market Economy?
    Practical Ethics 2026-03-02T12:01:43+00:00
  2986. AI: afternoon, with Claude
    Mark Bernstein 2026-03-02T17:43:17+00:00
  2987. ⛵️ Painting with Rebelle 8 Pro
    Nathan Youngman 2026-03-03T00:00:00+00:00
  2988. Public Listening: Jamie Lee's
    Fujichia 2026-03-03T15:16:46+00:00
  2989. The one science reform we can all agree on, but we're too cowardly to do
    Experimental History 2026-03-03T17:47:57+00:00
  2990. With Us or Against Us, Again
    The Present Age 2026-03-03T23:21:35+00:00
  2991. 💔 Animating with Moho 14.4
    Nathan Youngman 2026-03-04T00:00:00+00:00
  2992. Dollar Country: The Missing Episodes (256, 257, 258)
    Dollar Country Newsletter & Radio Show 2026-03-04T21:14:34+00:00
  2993. 🎹 Learning to Play
    Nathan Youngman 2026-03-05T00:00:00+00:00
  2994. [Outliers] J.W. Marriott: Building an Empire Without a Master Plan
    Farnam Street 2026-03-05T10:30:00+00:00
  2995. > 204: At least he didn't get Earl
    Laura Olin 2026-03-05T14:06:14+00:00
  2996. The Freedom of the City
    City of Yes 2026-03-05T16:04:14+00:00
  2997. A Detailed Review of Like 8% of Mewgenics
    The Bottom Feeder 2026-03-05T21:08:36+00:00
  2998. Haiku Activity & Contract Report, February 2026
    Haiku Project 2026-03-06T01:30:00+00:00
  2999. untitled
    https://jennifermillsnews.tumblr.com/ 2026-03-06T12:11:23+00:00
  3000. Chaos drafts with Ooze (WUBRG Drafting)
    Mediocre Magic 2026-03-06T13:27:00+00:00
  3001. A rudimentary paste
    Food is Stupid 2026-03-06T14:03:52+00:00
  3002. The Mystery of Rennes-le-Château, Part 1: The Priest’s Treasure
    The Digital Antiquarian 2026-03-06T16:50:42+00:00
  3003. hi hi! just wanted to pop in to say your guide to regency names is a godsend as an amateur regency historian and hopeful romance writer!! the tier system is perfect, and it’s made me feel better about all the characters I slap the name Mary onto. did you ever get around to doing some work on nicknames and/or accurate surnames? I’d love to hear of any primary sources you have for those two topics!
    Ye Olde News 2026-03-06T19:51:12+00:00
  3004. Noem Reassigned to Made-up Position, WAR-slash-Epstein Files or Real Terrorism Concern, & Panic at the Pump
    Kareem Abdul-Jabbar 2026-03-07T11:00:26+00:00
  3005. 1500 Regency Era Last Names
    Ye Olde News 2026-03-07T16:30:21+00:00
  3006. Chore Standards
    Jeff Kaufman 2026-03-08T13:00:00+00:00
  3007. France is Bacon, Organic Idiocy and the Chinese Room
    ntoll.org 2026-03-08T21:00:00+00:00
  3008. On the problem of landscape tarp
    The Rot 2026-03-09T01:01:28+00:00
  3009. untitled
    text-mode.org 2026-03-09T14:55:16+00:00
  3010. Dollar Country 260: Waltzes & 2-Steps (All Cajun no. 3)
    Dollar Country Newsletter & Radio Show 2026-03-09T16:02:44+00:00
  3011. Billionaires Influence Elections, Boys With Pocketfuls of Cash, & China's Nuke 'em All Test?
    Kareem Abdul-Jabbar 2026-03-10T10:03:23+00:00
  3012. Conflicted on Ramsey
    Jeff Kaufman 2026-03-10T13:00:00+00:00
  3013. Schrödinger's War
    The Present Age 2026-03-10T15:35:10+00:00
  3014. Some relationships deepen when you tell the truth and some end
    Escaping Flatland 2026-03-11T08:27:40+00:00
  3015. How Many Parking Permits?
    Jeff Kaufman 2026-03-11T13:00:00+00:00
  3016. Design Beyond the Human. Transdisciplinary Conversations about the Planet
    We Make Money Not Art 2026-03-11T14:04:00+00:00
  3017. Why I (Mostly) Stopped Posting To Youtube
    Dollar Country Newsletter & Radio Show 2026-03-11T15:40:24+00:00
  3018. Queen's Wish: A Postmortem Of Mixed Success
    The Bottom Feeder 2026-03-11T20:32:34+00:00
  3019. untitled
    STML 2026-03-12T09:05:19+00:00
  3020. Brookfield CEO Connor Teskey: AI Infrastructure, Data Centers, and the Future of Investing
    Farnam Street 2026-03-12T09:30:00+00:00
  3021. urgent mutual aid request for immigrant family
    Muppe 2026-03-12T18:04:25+00:00
  3022. untitled
    https://jennifermillsnews.tumblr.com/ 2026-03-13T12:12:49+00:00
  3023. Paved with Gold: The Hidden Costs of Free Transportation
    City of Yes 2026-03-13T13:01:08+00:00
  3024. The Patriotic Press
    The Present Age 2026-03-13T17:46:45+00:00
  3025. Double Standard at the Top, When One Tremor Ripples Across the Globe, & The Quiet Crisis in America’s Living Rooms
    Kareem Abdul-Jabbar 2026-03-14T10:02:44+00:00
  3026. Photoshops Without Explication
    Fujichia 2026-03-14T21:19:57+00:00
  3027. Foundations Draft #2 (WUBRG Drafting)
    Mediocre Magic 2026-03-15T03:26:00+00:00
  3028. 0.34.1 Bugfix Release
    Dungeon Crawl Stone Soup 2026-03-15T04:31:16+00:00
  3029. One hundred curl graphs
    daniel.haxx.se 2026-03-15T10:42:45+00:00
  3030. They'll Go First
    The Present Age 2026-03-16T22:08:08+00:00
  3031. Wreckage of Iran Air Flight 655, 1988.
    STML 2026-03-17T08:55:39+00:00
  3032. Robert Breer, Rug (1969)
    STML 2026-03-17T09:10:55+00:00
  3033. Zero Evidence Against Powell, When Asylum Isn't Asylum, & The Mayor Who Won't Bend
    Kareem Abdul-Jabbar 2026-03-17T10:02:55+00:00
  3034. Illustration of a 1982 Perfect Writer. “The manual accompanying Perfect Writer came with a fanciful…
    STML 2026-03-17T10:23:20+00:00
  3035. Help I'm being persecuted
    Experimental History 2026-03-17T13:33:13+00:00
  3036. COMPOST AFTER READING is officially OUT!
    The Rot 2026-03-17T18:09:44+00:00
  3037. untitled
    STML 2026-03-17T18:55:12+00:00
  3038. The Landscape Architecture of Auroras on Demand
    BLDGBLOG 2026-03-18T18:23:31+00:00
  3039. "What if We Didn't Suck?"
    The Present Age 2026-03-18T19:16:44+00:00
  3040. A Journey Through Infertility
    The Pudding 2026-03-19T05:00:00+00:00
  3041. [Outliers] Harrison McCain: How to Create Demand for Something Nobody Wants
    Farnam Street 2026-03-19T09:30:00+00:00
  3042. Disorder in the Liberal City
    City of Yes 2026-03-19T13:36:46+00:00
  3043. International CGT in Japan, Day One Talks
    Combinatorial Game Theory 2026-03-19T20:36:00+00:00
  3044. Know where your codes are
    gilest.org 2026-03-19T21:48:48+00:00
  3045. untitled
    https://jennifermillsnews.tumblr.com/ 2026-03-20T12:26:29+00:00
  3046. People are not friction
    daverupert.com 2026-03-20T15:54:00+00:00
  3047. The Mystery of Rennes-le-Château, Part 2: Secret Codes and Hidden Messages
    The Digital Antiquarian 2026-03-20T18:09:52+00:00
  3048. International CGT in Japan, Day Two
    Combinatorial Game Theory 2026-03-20T22:43:00+00:00
  3049. Putin’s Bubble Gets Smaller, Bondi Subpoenaed, & Cool It on the Judge Attacks, Says Chief Justice
    Kareem Abdul-Jabbar 2026-03-21T10:02:50+00:00
  3050. bye bye RTMP
    daniel.haxx.se 2026-03-21T14:06:12+00:00
  3051. The most deranged maniacs invading your world as Red Phantoms in Cradle of Nightmare
    Dark RPGs 2026-03-21T19:17:50+00:00
  3052. Wonderful Toolchain project update - March 2026
    Posts on asie's blog 2026-03-22T00:00:00+00:00
  3053. NTLM and SMB go opt-in
    daniel.haxx.se 2026-03-22T11:41:09+00:00
  3054. International CGT in Japan, Day Three
    Combinatorial Game Theory 2026-03-22T14:49:00+00:00
  3055. Contextual Collapse
    BLDGBLOG 2026-03-22T16:59:13+00:00
  3056. Witness at the End of Time
    Wild Information 2026-03-22T17:45:30+00:00
  3057. Nerd Deep Housecleaning Update, Part 1
    The Bottom Feeder 2026-03-22T19:54:02+00:00
  3058. International CGT in Japan, Day Four
    Combinatorial Game Theory 2026-03-23T08:59:00+00:00
  3059. Contra Dances Should Avoid Saturdays
    Jeff Kaufman 2026-03-23T13:00:00+00:00
  3060. Differently free
    Escaping Flatland 2026-03-23T17:04:07+00:00
  3061. Sloppelgängers
    The Present Age 2026-03-23T19:21:55+00:00
  3062. Dollar Country 261: On A Highway Heading South
    Dollar Country Newsletter & Radio Show 2026-03-24T03:18:41+00:00
  3063. When a Coin Becomes a Message, A War That Could Reshape the Global Economy, & Iran’s Violent Message to Its Citizens
    Kareem Abdul-Jabbar 2026-03-24T10:02:47+00:00
  3064. A Spanish-Speaking Robot in my Pocket
    Jeff Kaufman 2026-03-24T13:00:00+00:00
  3065. A New DCSS Server for South America
    Dungeon Crawl Stone Soup 2026-03-25T00:30:50+00:00
  3066. One hundred weirdo emails
    daniel.haxx.se 2026-03-25T08:05:41+00:00
  3067. AI: Physics
    Mark Bernstein 2026-03-25T12:46:25+00:00
  3068. Label By Usable Volume
    Jeff Kaufman 2026-03-25T13:00:00+00:00
  3069. collapse: data.models.worlds. What role does technology play in the intensifying state of crisis shaping our world?
    We Make Money Not Art 2026-03-25T14:00:53+00:00
  3070. About your last post, where you wished you could code more to make an accurate regency name generator — just wondering if you’ve ever heard of perchance.org? It’s designed for making random generators with zero coding necessary, unless you want the generator to look pretty.
    Ye Olde News 2026-03-25T17:19:01+00:00
  3071. A selection of strange and cryptic personal ads from The New York Herald, 1850s-1870s. 18/?
    Ye Olde News 2026-03-25T18:00:39+00:00
  3072. New Martian Writing
    Idle Words 2026-03-26T07:25:00+00:00
  3073. Joe Liemandt: Alpha School and the Future of Education
    Farnam Street 2026-03-26T09:30:00+00:00
  3074. Don’t trust, verify
    daniel.haxx.se 2026-03-26T10:09:07+00:00
  3075. When “Single-Family” Isn’t Family-Friendly
    City of Yes 2026-03-26T13:03:26+00:00
  3076. A selection of strange and cryptic personal ads from The New York Herald, 1860s to 1870s. 17/?
    Ye Olde News 2026-03-26T18:30:17+00:00
  3077. The IOC's New Policy Isn't Really a Trans Story
    The Present Age 2026-03-26T21:55:21+00:00
  3078. untitled
    https://jennifermillsnews.tumblr.com/ 2026-03-27T12:32:52+00:00
  3079. untitled
    https://jennifermillsnews.tumblr.com/ 2026-03-27T12:36:53+00:00
  3080. Panda cotta
    Food is Stupid 2026-03-27T13:22:57+00:00
  3081. AI and the human voice
    gilest.org 2026-03-27T16:53:54+00:00
  3082. A selection of strange and cryptic personal ads from The New York Herald, 1860s to 1890s. 16/?
    Ye Olde News 2026-03-27T18:30:27+00:00
  3083. How Social Media Became the New Tobacco, The Promise We Broke, & When Public Health Goes Quiet
    Kareem Abdul-Jabbar 2026-03-28T10:01:48+00:00
  3084. Artemis II Is Not Safe to Fly
    Idle Words 2026-03-28T12:29:00+00:00
  3085. A selection of strange and cryptic personal ads from The New York Herald, 1860s to 1890s. 15/?
    Ye Olde News 2026-03-28T18:30:55+00:00
  3086. NYC Draft: Spiders, Turtles, Turtles
    Mediocre Magic 2026-03-28T19:28:00+00:00
  3087. A selection of strange and cryptic personal ads from The New York Herald, 1860s to 1890s. 14/?
    Ye Olde News 2026-03-29T18:30:48+00:00
  3088. A Guide to Common Regency-Era Nicknames
    Ye Olde News 2026-03-30T19:07:55+00:00
  3089. They Know What "Wrong Place, Wrong Time" Means
    The Present Age 2026-03-30T20:49:49+00:00
  3090. The Houthis Didn’t Suddenly Materialize, Where Accountability Goes to Die, Clowns Become Candidates, & History Has Receipts...Legal Spin Doesn’t.
    Kareem Abdul-Jabbar 2026-03-31T10:02:41+00:00
  3091. Photos from my time in Iran, 2017
    Res Obscura 2026-03-31T12:50:11+00:00
  3092. Minotaur Eyes
    Steelsnowflake 2026-03-31T13:15:06+00:00
  3093. Infinite midwit
    Experimental History 2026-03-31T16:05:29+00:00
  3094. Please my sneeze.
    Muppe 2026-03-31T17:59:27+00:00
  3095. Ponder This Challenge - April 2026 - The Unlabeled Clock
    IBM Ponder This 2026-03-31T22:00:00+00:00
  3096. You Don't Have To Be A Fool To Be A Fool.
    Discworld MUD Dev Blog 2026-04-01T05:48:08+00:00
  3097. The Mystery of Rennes-le-Château, Part 3: A Secret History
    The Digital Antiquarian 2026-04-01T07:05:28+00:00
  3098. Some reflections on Elena Conis’ lecture “Contextualising the Modern Era of Vaccination”
    Practical Ethics 2026-04-01T09:18:57+00:00
  3099. Days are enormous
    Escaping Flatland 2026-04-01T11:14:48+00:00
  3100. @rebellum who asked about this post - Question: what regions does this cover? You mention “the…
    Ye Olde News 2026-04-01T20:10:11+00:00
  3101. I just really hate the word “fandom”. It’s just a portmanteau of “fan” and “random”. It sounds like some desperate attempt to be quirky and different. Plus, the word “fanbase” already exists.
    Ye Olde News 2026-04-01T20:50:23+00:00
  3102. Epic Hero #2, Dungeon of Derojhen: Final Judgement
    Renga in Blue 2026-04-02T02:06:55+00:00
  3103. More, and More Extensive, Supply Chain Attacks
    Jeff Kaufman 2026-04-02T13:00:00+00:00
  3104. > 205: Something hopeful to show the world you hoped?
    Laura Olin 2026-04-02T13:05:55+00:00
  3105. Picture Perfect. Challenging dominant Western beauty standards
    We Make Money Not Art 2026-04-02T13:59:19+00:00
  3106. Remote Isn’t Working
    City of Yes 2026-04-02T15:43:55+00:00
  3107. Useful
    The Present Age 2026-04-02T20:46:31+00:00
  3108. Skull Cave: The Mystery of the Mazes
    Renga in Blue 2026-04-02T22:31:32+00:00
  3109. untitled
    https://jennifermillsnews.tumblr.com/ 2026-04-03T12:16:38+00:00
  3110. Reconsider Challenging Sessions at Weekends
    Jeff Kaufman 2026-04-03T13:00:00+00:00
  3111. Vegan ortolan
    Food is Stupid 2026-04-03T13:04:03+00:00
  3112. A sleep aid
    Interconnected 2026-04-03T17:14:00+00:00
  3113. Bondi Fired, Courage After Retirement, & A Library So Fancy "It" Forgot the Books
    Kareem Abdul-Jabbar 2026-04-04T10:01:34+00:00
  3114. Chicken-Free Egg Whites
    Jeff Kaufman 2026-04-04T13:00:00+00:00
  3115. Before I go: People like it when other people make things
    daverupert.com 2026-04-04T17:00:00+00:00
  3116. Ozempic dreams
    daverupert.com 2026-04-04T19:18:00+00:00
  3117. Songsoo Kim’s Rapini Doenjang Guk (Flowering Spring Greens Soup)
    Vittles 2026-04-05T08:11:51+00:00
  3118. Spring Soups: A Vittles Cooking Supplement
    Vittles 2026-04-05T08:21:52+00:00
  3119. Unsweetened Whipped Cream
    Jeff Kaufman 2026-04-05T13:00:00+00:00
  3120. Listening for love or lust.
    The Curiosity Cabinet 2026-04-06T01:50:16+00:00
  3121. The Phantom Ship / Yuureisen (1982)
    Renga in Blue 2026-04-06T02:49:41+00:00
  3122. Nomic Coding Game
    nklein software 2026-04-06T04:09:29+00:00
  3123. THE END OF THE PARTY
    Infinite Gossip 2026-04-06T07:52:35+00:00
  3124. Destruction of Infrastructure for the Impact on Civilians is Manifestly Illegal
    Jeff Kaufman 2026-04-06T13:00:00+00:00
  3125. Community Iftar
    Practical Ethics 2026-04-06T17:18:34+00:00
  3126. MISSING PERSON PLEASE SHARE
    Muppe 2026-04-06T23:31:13+00:00
  3127. Hospitality has a wage theft problem
    Vittles 2026-04-07T07:39:09+00:00
  3128. Doctor Breaks Silence on Trump’s Health, A Cuba Policy Built on Pain, & A Championship Won the UCLA Way
    Kareem Abdul-Jabbar 2026-04-07T10:01:35+00:00
  3129. Contra Dance Piano Teaching Videos
    Jeff Kaufman 2026-04-07T13:00:00+00:00
  3130. Inverted themes with light-dark()
    daverupert.com 2026-04-07T15:31:00+00:00
  3131. Curiosity Rover’s damaged wheels after 13 years, or 7.25 Martian years of service on the Red Planet….
    STML 2026-04-08T19:08:43+00:00
  3132. Mario Harik: Playing to Win
    Farnam Street 2026-04-09T09:30:00+00:00
  3133. Arrested Development
    City of Yes 2026-04-09T14:00:53+00:00
  3134. Take Him Literally
    The Present Age 2026-04-09T15:43:02+00:00
  3135. Message From Space No Talking
    Fujichia 2026-04-09T22:15:54+00:00
  3136. Chaos hears the Call of the Ring (WUBRG Drafting)
    Mediocre Magic 2026-04-10T01:25:00+00:00
  3137. untitled
    https://jennifermillsnews.tumblr.com/ 2026-04-10T12:14:29+00:00
  3138. At Dalston’s Ridley Road Indoor Market, a Community Fights for Its Survival
    Vittles 2026-04-10T13:03:25+00:00
  3139. Baked alphabet
    Food is Stupid 2026-04-10T13:04:14+00:00
  3140. Seemingly in cahoots
    gilest.org 2026-04-10T13:13:03+00:00
  3141. mist is now open source and looking for interop
    Interconnected 2026-04-10T16:35:00+00:00
  3142. When Both Sides Declare Victory, Who Drops Nearly a Billion Before a Ceasefire, & Typhus Is Back in L.A.
    Kareem Abdul-Jabbar 2026-04-11T10:01:41+00:00
  3143. Iterating the potatoes
    gilest.org 2026-04-11T14:10:54+00:00
  3144. The Phantom Ship / Yuureisen: Mounds of Verbs
    Renga in Blue 2026-04-11T20:20:03+00:00
  3145. Sprouts 2026 Summaries
    Combinatorial Game Theory 2026-04-12T17:34:00+00:00
  3146. Sprouts 2026 Afterthoughts
    Combinatorial Game Theory 2026-04-12T18:18:00+00:00
  3147. Good news from Hungary
    Crooked Timber 2026-04-13T03:45:06+00:00
  3148. Combining Rate and Instructions to Create Beautiful Madness
    a blog by biggiemac42 2026-04-13T06:39:10+00:00
  3149. How Hurricane Melissa Affected Food and Farming in Jamaica
    Vittles 2026-04-13T07:45:38+00:00
  3150. Happy Birthday, Dorothy Lynch
    Midwesterner 2026-04-13T11:02:41+00:00
  3151. When moving fast, talking is the first thing to break
    daverupert.com 2026-04-13T15:10:00+00:00
  3152. Haiku Activity & Contract Report, March 2026 (ft. ARM64)
    Haiku Project 2026-04-14T02:00:00+00:00
  3153. How to walk through walls
    Escaping Flatland 2026-04-14T07:22:57+00:00
  3154. Pluto’s Hillary Mountains
    STML 2026-04-14T08:52:59+00:00
  3155. How many babies do we want? How many will we have?
    Crooked Timber 2026-04-14T09:17:32+00:00
  3156. A Two‑Week Sprint Into a Forty‑Year Problem, When Political Theater Meets a 2,000‑Year‑Old Institution, & The Real Reason Stress Keeps Winning
    Kareem Abdul-Jabbar 2026-04-14T10:02:14+00:00
  3157. History Nerd Bucket List: The Jenny Geddes Stool
    Crooked Timber 2026-04-14T11:06:13+00:00
  3158. The "Foremost War Skeptic"
    The Present Age 2026-04-14T13:44:44+00:00
  3159. Soil Turn—A Field Guide to Artistic Earthly Engagements
    We Make Money Not Art 2026-04-14T14:21:52+00:00
  3160. Nothing ever dies. It merely becomes embarrassing.
    Experimental History 2026-04-14T16:16:42+00:00
  3161. I Will Never Respect A Website
    Ed Zitron's Where's Your Ed At 2026-04-14T16:22:59+00:00
  3162. How to use your compost now that it's officially spring
    The Rot 2026-04-14T16:54:20+00:00
  3163. Ideas for Mickey
    Fujichia 2026-04-14T17:46:32+00:00
  3164. I’ve just spent about two hours reading through ALL your Rachel & Co. posts. Thank you for sharing all these wonderful letters!
    Ye Olde News 2026-04-14T19:45:30+00:00
  3165. The Phantom Ship / Yuureisen: Cursed Defiler
    Renga in Blue 2026-04-15T03:38:06+00:00
  3166. Music break: Baba Yetu
    Crooked Timber 2026-04-15T09:12:06+00:00
  3167. The Dorothy Lynch Red Beer
    Midwesterner 2026-04-15T11:01:16+00:00
  3168. The chewy, nutty snack from Isfahan
    Snack Stack 2026-04-15T12:17:00+00:00
  3169. ‘Once Queensway Market is gone, there won’t be anything like it left.’
    Vittles 2026-04-15T14:01:55+00:00
  3170. I don't want a screenshot of your Claude conversation
    daverupert.com 2026-04-15T15:17:00+00:00
  3171. A Real Delivery
    The Present Age 2026-04-15T16:42:42+00:00
  3172. This week in Rachel & Co. history…
    Ye Olde News 2026-04-15T20:09:56+00:00
  3173. EsoNatLangs Bring the Complexity of Natural Language into Code
    esoteric.codes 2026-04-16T05:22:00+00:00
  3174. EP-40 Riddim cheat sheet
    Spongefile 2026-04-16T18:00:23+00:00
  3175. Global science equity – towards solutions
    Crooked Timber 2026-04-17T07:38:21+00:00
  3176. Where to Eat Outside of London This Weekend
    Vittles 2026-04-17T09:18:36+00:00
  3177. Dorothy Lynch Everything
    Midwesterner 2026-04-17T11:03:41+00:00
  3178. The Work of Community
    City of Yes 2026-04-17T12:15:43+00:00
  3179. untitled
    https://jennifermillsnews.tumblr.com/ 2026-04-17T12:26:35+00:00
  3180. Limburger Bay Biscuits
    Food is Stupid 2026-04-17T13:03:16+00:00
  3181. The Mystery of Rennes-le-Château, Part 4: Non-Fiction Meets Fiction
    The Digital Antiquarian 2026-04-17T16:12:17+00:00
  3182. Premium: The Hater's Guide to Private Credit
    Ed Zitron's Where's Your Ed At 2026-04-17T16:57:30+00:00
  3183. Are those what my grand mother called leg-of-mutton sleeves?
    Ye Olde News 2026-04-17T17:36:35+00:00
  3184. The Palace Says It Cares, But Actions Tell a Different Story, Tax Day Shows the Gap, & Trump vs. The Pope Who Doesn’t Need His Approval
    Kareem Abdul-Jabbar 2026-04-18T10:01:31+00:00
  3185. Fifteen Years Aboard
    Jeff Kaufman 2026-04-18T13:00:00+00:00
  3186. Headless everything for personal AI
    Interconnected 2026-04-18T17:00:00+00:00
  3187. Bobby, I hardly Knew Ye
    Crooked Timber 2026-04-19T03:40:07+00:00
  3188. MixedHTML Mode for Emacs
    Jeff Kaufman 2026-04-19T13:00:00+00:00
  3189. Marry your boss?
    The Curiosity Cabinet 2026-04-19T13:19:40+00:00
  3190. Sunday photoblogging: Pézenas street
    Crooked Timber 2026-04-19T19:56:35+00:00
  3191. Eat This, Not That
    Vittles 2026-04-20T07:21:21+00:00
  3192. Archive Dive: Still Renting After All These Years
    The Deleted Scenes 2026-04-20T12:55:46+00:00
  3193. Exclusive: Microsoft To Shift GitHub Copilot Users To Token-Based Billing, Tighten Rate Limits
    Ed Zitron's Where's Your Ed At 2026-04-20T17:11:58+00:00
  3194. Dollar Country 262: Too Tough To Die
    Dollar Country Newsletter & Radio Show 2026-04-20T17:17:58+00:00
  3195. Thank You For Being a Friend
    Coding Horror 2026-04-20T17:21:00+00:00
  3196. Occasional paper: Inconstant moon
    Crooked Timber 2026-04-20T21:46:33+00:00
  3197. OnionWars
    The Present Age 2026-04-20T22:30:52+00:00
  3198. Free Newsletter Tuesday: Midterm Panic, What's Up With the California Democratic Party, & The $8 Billion Machine Sprint
    Kareem Abdul-Jabbar 2026-04-21T10:02:58+00:00
  3199. AI AI Captain! Der Wienerschnitzel Edition
    The Deleted Scenes 2026-04-21T12:55:58+00:00
  3200. Automated Deanonymization is Here
    Jeff Kaufman 2026-04-21T13:00:00+00:00
  3201. Courier: real-time messaging for ESP32 with batteries included (new library)
    Interconnected 2026-04-21T15:25:00+00:00
  3202. Four Horsemen of the AIpocalypse
    Ed Zitron's Where's Your Ed At 2026-04-21T16:28:59+00:00
  3203. My Wikipedia Edits
    Fujichia 2026-04-21T17:32:06+00:00
  3204. 10,000-watt GPU meet 40-watt lump of meat
    daverupert.com 2026-04-21T19:36:00+00:00
  3205. [UPDATED] News: Anthropic (Briefly) Removes Claude Code From $20-A-Month "Pro" Subscription Plan For New Users
    Ed Zitron's Where's Your Ed At 2026-04-21T22:44:29+00:00
  3206. Chaos Clue Draft #2 (WUBRG Drafting)
    Mediocre Magic 2026-04-22T00:57:00+00:00
  3207. Greg Brockman: Inside the 72 Hours That Almost Killed OpenAI
    Farnam Street 2026-04-22T07:41:08+00:00
  3208. Nick Bramham’s Spanakorizo
    Vittles 2026-04-22T07:53:06+00:00
  3209. untitled
    STML 2026-04-22T10:14:48+00:00
  3210. High-Quality Chaos
    daniel.haxx.se 2026-04-22T11:44:40+00:00
  3211. Seeing Red
    The Deleted Scenes 2026-04-22T12:55:31+00:00
  3212. Your Supplies Probably Won't Be Stolen in a Disaster
    Jeff Kaufman 2026-04-22T13:00:00+00:00
  3213. The handmade beauty of Machine Age data visualizations
    Res Obscura 2026-04-22T13:05:46+00:00
  3214. The Scolding
    The Present Age 2026-04-22T16:18:08+00:00
  3215. [Updated] Exclusive: Microsoft Moving All GitHub Copilot Subscribers To Token-Based Billing In June
    Ed Zitron's Where's Your Ed At 2026-04-22T17:24:17+00:00
  3216. When The Rubber Meets The Road
    The Deleted Scenes 2026-04-23T12:55:53+00:00
  3217. Sanctuary Suburbs
    City of Yes 2026-04-23T13:01:35+00:00
  3218. On Reinforcing Cynicism in the Academy
    Crooked Timber 2026-04-24T07:43:41+00:00
  3219. Six Unexpectedly Exceptional Breakfasts
    Vittles 2026-04-24T10:31:47+00:00
  3220. untitled
    https://jennifermillsnews.tumblr.com/ 2026-04-24T12:24:39+00:00
  3221. New and Old #263
    The Deleted Scenes 2026-04-24T12:55:48+00:00
  3222. Contra Events Pairing Callers By Age?
    Jeff Kaufman 2026-04-24T13:00:00+00:00
  3223. Weird Dutch pizza
    Food is Stupid 2026-04-24T13:03:27+00:00
  3224. Premium: How OpenAI Kills Oracle
    Ed Zitron's Where's Your Ed At 2026-04-24T16:40:45+00:00
  3225. The “Freakout” and the “Abyss”
    The Present Age 2026-04-24T17:01:02+00:00
  3226. The Wind in the Willows and reading out loud
    Interconnected 2026-04-24T17:56:00+00:00
  3227. Dismissing Excellence in the Highest Court, The Brief Life of a MAGA Secretary, & Stagflation 2.0
    Kareem Abdul-Jabbar 2026-04-25T10:02:10+00:00
  3228. It took me far too long to realize that the final word was “blond” as in blonde lace, and Lady…
    Ye Olde News 2026-04-25T18:35:16+00:00
  3229. The Phantom Ship / Yuureisen: Say Amen
    Renga in Blue 2026-04-26T05:36:51+00:00
  3230. Chatty Chatty Change Change
    Discworld MUD Dev Blog 2026-04-26T06:20:58+00:00
  3231. Sunday photoblogging: l’Abbaye de Valmagne
    Crooked Timber 2026-04-26T07:41:01+00:00
  3232. Dot’s Home thoughts
    The Virtual Moose 2026-04-26T12:55:42+00:00
  3233. An excerpt from the trial of Elinor Crane, who was arrested in Middlesex in 1693 on suspicion of…
    Ye Olde News 2026-04-26T15:11:09+00:00
  3234. "A 50-year-old sociopath?"
    The Curiosity Cabinet 2026-04-26T18:41:54+00:00
  3235. Patrick: An Illustrated Essay
    Vittles 2026-04-27T08:01:07+00:00
  3236. Patrick
    Vittles 2026-04-27T08:06:18+00:00
  3237. Thoughts about making a career as a writer
    Escaping Flatland 2026-04-27T08:26:41+00:00
  3238. A museum about museums
    gilest.org 2026-04-27T11:16:41+00:00
  3239. First Train To Clarksburg?
    The Deleted Scenes 2026-04-27T12:55:50+00:00
  3240. Contra Binder on far-UVC and filtration
    Jeff Kaufman 2026-04-27T13:00:00+00:00
  3241. vigilance
    Weird Fucking Games 2026-04-27T16:11:55+00:00
  3242. A Bad Look
    The Present Age 2026-04-27T18:09:12+00:00
  3243. Allotment engineering
    gilest.org 2026-04-28T06:58:36+00:00
  3244. The Illusion of Security at the Washington Hilton, MTG Keeps Sounding the Alarm, & Vatican Dinners and Venture Capital
    Kareem Abdul-Jabbar 2026-04-28T10:03:26+00:00
  3245. Occasional paper: Blue Angels, Devil Hands
    Crooked Timber 2026-04-28T11:03:08+00:00
  3246. Friction and Reactionary Politics
    The Deleted Scenes 2026-04-28T12:26:11+00:00
  3247. Interview As Funeral Cone
    Fujichia 2026-04-28T15:46:29+00:00
  3248. AI's Economics Don't Make Sense [Ad Free]
    Ed Zitron's Where's Your Ed At 2026-04-28T16:33:46+00:00
  3249. AI's Economics Don't Make Sense
    Ed Zitron's Where's Your Ed At 2026-04-28T16:35:07+00:00
  3250. The 3rd Annual Blog Post Competition, Extravaganza, and Jamboree
    Experimental History 2026-04-28T17:48:50+00:00
  3251. OpenAI Projects ChatGPT Plus subscriptions to drop by 80% from 44 Million in 2025 to 9 Million In 2026, Made Up Using Cheaper Subscriptions (Somehow)
    Ed Zitron's Where's Your Ed At 2026-04-28T22:40:34+00:00
  3252. How I make a microbe shirt
    The Rot 2026-04-28T23:54:17+00:00
  3253. curl 8.20.0
    daniel.haxx.se 2026-04-29T06:27:01+00:00
  3254. With A Capital T That's Next To S Which Stands For Sky(scraper)
    The Deleted Scenes 2026-04-29T12:26:05+00:00
  3255. Are "Vintage LLMs" the start of a new humanistic field?
    Res Obscura 2026-04-29T12:45:56+00:00
  3256. Let Kids Keep More Productivity Gains
    Jeff Kaufman 2026-04-29T13:00:00+00:00
  3257. PS5’s latest DRM fiasco appears to be not as bad as first thought, but some official communication from Sony would be great
    PS5 – Destructoid 2026-04-29T15:29:11+00:00
  3258. A.P.E
    Weird Fucking Games 2026-04-29T16:11:55+00:00
  3259. We need RSS for sharing abundant vibe-coded apps
    Interconnected 2026-04-29T17:58:00+00:00
  3260. St. Andrew’s Adventure (1983)
    Renga in Blue 2026-04-29T20:23:52+00:00
  3261. Inspired
    daniel.haxx.se 2026-04-30T06:49:47+00:00
  3262. Approaching zero bugs?
    daniel.haxx.se 2026-04-30T08:08:34+00:00
  3263. Who Binds You?
    The Deleted Scenes 2026-04-30T12:55:58+00:00
  3264. Against In-Duct UV
    Jeff Kaufman 2026-04-30T13:00:00+00:00
  3265. The Plaza and the Parking Lot
    City of Yes 2026-04-30T13:02:54+00:00
  3266. yeoldenews: yeoldenews: yeoldenews: In April of 1896 Will...
    Ye Olde News 2026-04-30T17:18:10+00:00
  3267. Ponder This Challenge - May 2026 - The Powers of a Binary Matrix
    IBM Ponder This 2026-05-01T06:00:00+00:00
  3268. Haiku to mentor 3 students in Google Summer of Code 2026
    Haiku Project 2026-05-01T08:00:00+00:00
  3269. Beyond the Hype at London’s Newest Viral Sandwich Spot
    Vittles 2026-05-01T09:54:25+00:00
  3270. New and Old #264
    The Deleted Scenes 2026-05-01T12:55:49+00:00
  3271. Filthy soda
    Food is Stupid 2026-05-01T13:03:06+00:00
  3272. The First Amendment
    The Present Age 2026-05-01T14:57:39+00:00
  3273. Escape From Sparta (1983)
    Renga in Blue 2026-05-01T15:08:31+00:00
  3274. untitled
    https://jennifermillsnews.tumblr.com/ 2026-05-01T15:36:46+00:00
  3275. The Mystery of Rennes-le-Château, Part 5: The Man Behind the Curtain
    The Digital Antiquarian 2026-05-01T15:46:03+00:00
  3276. Easter Egg
    Weird Fucking Games 2026-05-01T16:49:06+00:00
  3277. Behind the Scenes of London's Most Influential Restaurant Group w/ Songsoo Kim
    Vittles 2026-05-02T09:32:56+00:00
  3278. More Than Just a Map, The Art of the Endless Distraction, & How Much Is a Dream Worth?
    Kareem Abdul-Jabbar 2026-05-02T10:01:14+00:00
  3279. A New DCSS Server for the US West Coast
    Dungeon Crawl Stone Soup 2026-05-02T19:25:21+00:00
  3280. Sunday photoblogging: Canigou and cherry trees
    Crooked Timber 2026-05-03T06:30:10+00:00
  3281. [GSoC 2026] Modernizing Haiku’s Bluetooth stack: Implementing support for HFP profile
    Haiku Project 2026-05-03T06:37:41+00:00
  3282. [GSoC 2026] Bluetooth: HCI Improvements & HID Profile
    Haiku Project 2026-05-03T13:46:03+00:00
  3283. Text Adventures Still Rule in the Year 2026
    The Virtual Moose 2026-05-03T14:18:54+00:00
  3284. Horror House (1983)
    Renga in Blue 2026-05-03T15:17:45+00:00
  3285. PARA//LLAX
    Weird Fucking Games 2026-05-03T16:49:05+00:00
  3286. The duality of language models in the browser
    daverupert.com 2026-05-04T00:33:00+00:00
  3287. Comparisons as Predictable as the Sunrise
    The Pudding 2026-05-04T05:00:00+00:00
  3288. The history of London's squat cafes
    Vittles 2026-05-04T08:04:20+00:00
  3289. "Urbanist Sprawl" Revisited
    The Deleted Scenes 2026-05-04T12:55:58+00:00
  3290. Alarming Scheduling
    Jeff Kaufman 2026-05-04T13:00:00+00:00
  3291. Dollar Country 263: Alabama, Georgia, & Mississippi
    Dollar Country Newsletter & Radio Show 2026-05-04T14:03:08+00:00
  3292. Premium: The AI Compute Demand Story Is A Lie
    Ed Zitron's Where's Your Ed At 2026-05-04T14:09:22+00:00
  3293. Meandering Along the Alabama River
    FYFD 2026-05-04T15:00:00+00:00
  3294. PRISM: The T100 Version
    Renga in Blue 2026-05-04T20:24:24+00:00
  3295. Perfect Tides: Station to Station thoughts
    The Virtual Moose 2026-05-04T21:23:07+00:00
  3296. Retaliation is Not a Strategy, The Cienfuegos Ghost, & The Silence of NASA
    Kareem Abdul-Jabbar 2026-05-05T10:02:58+00:00
  3297. Ghost Of The Highways
    The Deleted Scenes 2026-05-05T12:55:26+00:00
  3298. Don’t Fall for the Tucker Carlson Apology Tour
    The Present Age 2026-05-05T13:26:49+00:00
  3299. Vibe Check №42
    daverupert.com 2026-05-05T13:41:00+00:00
  3300. Fluids Can Fracture
    FYFD 2026-05-05T15:00:00+00:00
  3301. When Should we Argue?
    Practical Ethics 2026-05-05T15:20:03+00:00
  3302. One Weird Trick
    Fujichia 2026-05-05T15:42:38+00:00
  3303. Made a Flickgame
    The Virtual Moose 2026-05-05T15:50:03+00:00
  3304. Same Game, Different Music
    Weird Fucking Games 2026-05-05T16:49:05+00:00
  3305. Theros: Face the Hydra (WUBRG Sealed)
    Mediocre Magic 2026-05-05T17:10:00+00:00
  3306. The Rise and Fall and Rise Again of the American Bald Eagle
    Steelsnowflake 2026-05-05T20:03:41+00:00
  3307. Warp Door's April 2026 Roundup
    Warp Door 2026-05-06T02:22:33+00:00
  3308. The world reveals itself to those who travel by foot
    Escaping Flatland 2026-05-06T09:10:51+00:00
  3309. Eight People, One Hob
    Vittles 2026-05-06T10:27:09+00:00
  3310. Buffet Chronicles: Eat Like A Mongol?
    The Deleted Scenes 2026-05-06T12:55:31+00:00
  3311. Plucking Droplets
    FYFD 2026-05-06T15:00:00+00:00
  3312. Am I Meant To Be Impressed?
    Ed Zitron's Where's Your Ed At 2026-05-06T15:13:07+00:00
  3313. hey, i love your rachel and co project. every few weeks i find myself coming back and rereading some of the posts. one thing i was wondering is how aunt gussie is related to everyone is she rachel’s dad’s sister?
    Ye Olde News 2026-05-06T19:13:30+00:00
  3314. My grand mother told me her mother told her of doing washing for rich ppl at a place called Sylvan Beach and ironing leg-of-mutton sleeves with solid metal sad irons.
    Ye Olde News 2026-05-06T19:20:20+00:00
  3315. Hopscotch (FeatureKreep)
    Warp Door 2026-05-07T00:20:32+00:00
  3316. Winston Weinberg: Speed, Stress, and Better Decisions
    Farnam Street 2026-05-07T09:55:00+00:00
  3317. untitled
    text-mode.org 2026-05-07T10:24:07+00:00
  3318. untitled
    text-mode.org 2026-05-07T10:42:33+00:00
  3319. untitled
    text-mode.org 2026-05-07T10:48:35+00:00
  3320. How to Ranch-Wash Anything
    Midwesterner 2026-05-07T11:01:00+00:00
  3321. The Arrival (Edward4hands)
    Warp Door 2026-05-07T11:30:13+00:00
  3322. The Curious Early D.C. Suburbs, Wheaton, Maryland Edition
    The Deleted Scenes 2026-05-07T12:55:55+00:00
  3323. The End of Urban Renewal
    City of Yes 2026-05-07T13:02:46+00:00
  3324. Inside an Ear
    FYFD 2026-05-07T15:00:00+00:00
  3325. Radio Galaxy
    Weird Fucking Games 2026-05-07T18:49:05+00:00
  3326. Chthosis (Mathias Waltz)
    Warp Door 2026-05-08T00:21:22+00:00
  3327. [GSoC 2026] Expanding the functionality of the Haiku Devices Application
    Haiku Project 2026-05-08T02:10:41+00:00
  3328. I want my MTV
    Interconnected 2026-05-08T02:51:00+00:00
  3329. VAPOR GALLERY (Liam Kenna)
    Warp Door 2026-05-08T03:21:44+00:00
  3330. Same Impala?
    Vittles 2026-05-08T09:32:52+00:00
  3331. untitled
    https://jennifermillsnews.tumblr.com/ 2026-05-08T12:40:29+00:00
  3332. New and Old #265
    The Deleted Scenes 2026-05-08T12:55:24+00:00
  3333. AI is Breaking Two Vulnerability Cultures
    Jeff Kaufman 2026-05-08T13:00:00+00:00
  3334. We put on a show
    gilest.org 2026-05-08T13:30:31+00:00
  3335. Uncertainty as a field of action. An interview with Amanda Masha Caminals
    We Make Money Not Art 2026-05-08T14:11:59+00:00
  3336. Premium: AI's Circular Psychosis
    Ed Zitron's Where's Your Ed At 2026-05-08T14:40:45+00:00
  3337. “Spiralling Textures”
    FYFD 2026-05-08T15:00:00+00:00
  3338. This Week on The Analog Antiquarian
    The Digital Antiquarian 2026-05-08T16:11:33+00:00
  3339. Nova Sonata
    Warp Door 2026-05-09T07:09:07+00:00
  3340. For us, by us
    gilest.org 2026-05-09T07:56:16+00:00
  3341. Minifold 01: First Fold (Pingfan Jie)
    Warp Door 2026-05-09T10:02:09+00:00
  3342. The Borrowed Future, The Infrastructure of Inequality & The Photo-Op Summit
    Kareem Abdul-Jabbar 2026-05-09T10:02:56+00:00
  3343. Somerville Porchfest 2026
    Jeff Kaufman 2026-05-09T13:00:00+00:00
  3344. Signal Garden
    Weird Fucking Games 2026-05-09T18:49:05+00:00
  3345. Store Draft became a Team vs Hordes Draft (WUBRG Drafting)
    Mediocre Magic 2026-05-09T19:36:00+00:00
  3346. Marges Destimbats (Crumbled Stone Walls) thoughts
    The Virtual Moose 2026-05-09T20:50:11+00:00
  3347. Sunday photoblogging: Pézenas, maison consulaire
    Crooked Timber 2026-05-10T07:47:39+00:00
  3348. Dual Bore Janko Venova
    Jeff Kaufman 2026-05-10T13:00:00+00:00
  3349. Blog Roundup (May 10, 2026)
    The Virtual Moose 2026-05-10T14:42:17+00:00
  3350. Cow (Demo) (ZDEsy)
    Warp Door 2026-05-10T14:45:34+00:00
  3351. Meanderware thoughts
    The Virtual Moose 2026-05-10T18:57:42+00:00
  3352. My Eyes Are Up Here
    Discworld MUD Dev Blog 2026-05-10T23:22:54+00:00
  3353. Mythos finds a curl vulnerability
    daniel.haxx.se 2026-05-11T06:01:35+00:00
  3354. From The People’s Bank to the Banker’s Bank
    Crooked Timber 2026-05-11T07:11:42+00:00
  3355. The Rise and Fall of Mercato Metropolitano
    Vittles 2026-05-11T07:27:14+00:00
  3356. places, i (droqen, Remi Marchand, Sakib Chowdhury)
    Warp Door 2026-05-11T08:21:04+00:00
  3357. For a Good Time, Call 347-1111
    Midwesterner 2026-05-11T11:12:35+00:00
  3358. Archive Dive: When I Say "City," You Say...
    The Deleted Scenes 2026-05-11T12:56:05+00:00
  3359. Loved the Incredibly Ambitious Interactive Fiction Game PARA//LLAX
    The Virtual Moose 2026-05-11T13:22:56+00:00
  3360. Liquid Pulleys and Gears
    FYFD 2026-05-11T15:00:00+00:00
  3361. Death And Taxes
    Discworld MUD Dev Blog 2026-05-11T18:40:02+00:00
  3362. Fish Bone
    Weird Fucking Games 2026-05-11T18:49:05+00:00
  3363. This Arcade Game Lets You Invade Iran as Trump
    The Present Age 2026-05-11T21:57:26+00:00
  3364. INLAND FROM SEAWORLD
    Infinite Gossip 2026-05-12T02:41:32+00:00
  3365. Haiku Activity & Contract Report, April 2026
    Haiku Project 2026-05-12T03:30:00+00:00
  3366. Poingle! (Demo) (SlappyHappy2000)
    Warp Door 2026-05-12T08:44:31+00:00
  3367. The UFO Files, The Hollywood Reset, & Don't Buy the Gold Card Hype
    Kareem Abdul-Jabbar 2026-05-12T10:03:30+00:00
  3368. Date with a T-Rex <3 (rabbytt)
    Warp Door 2026-05-12T11:40:44+00:00
  3369. Red Hot And Green
    The Deleted Scenes 2026-05-12T12:56:13+00:00
  3370. The text is not the product
    Crooked Timber 2026-05-12T13:19:45+00:00
  3371. Shame them, shun them, ban them, beat them!
    Experimental History 2026-05-12T13:23:30+00:00
  3372. Made Another Flickgame
    The Virtual Moose 2026-05-12T13:26:01+00:00
  3373. Jets From Impact
    FYFD 2026-05-12T15:00:00+00:00
  3374. Where Are All The Data Centers?
    Ed Zitron's Where's Your Ed At 2026-05-12T16:17:30+00:00
  3375. Stop Using Experimental Art As A Cudgel
    The Virtual Moose 2026-05-12T23:00:17+00:00
  3376. Showstopper, Centre Piece
    Vittles 2026-05-13T07:45:07+00:00
  3377. Thanassis Stavrakis, A man carrying a sheep on a motorcycle during a wildfire in Patras, western…
    STML 2026-05-13T10:57:56+00:00
  3378. Montgomery Inn Forever
    Midwesterner 2026-05-13T11:03:24+00:00
  3379. Iron Maiden T-Shirt With Ice Cream
    Fujichia 2026-05-13T11:40:36+00:00
  3380. 05/13/2026
    Dwarf Fortress Development Log 2026-05-13T12:00:00+00:00
  3381. 2026-05-13: DF 53.13 Released
    Dwarf Fortress Development Log 2026-05-13T12:00:00+00:00
  3382. Is "Good Friction" A Bad Idea?
    The Deleted Scenes 2026-05-13T12:55:50+00:00
  3383. How the “Impossible Torpedo” Worked
    FYFD 2026-05-13T15:00:00+00:00
  3384. The Coal Room
    Weird Fucking Games 2026-05-13T18:49:05+00:00
  3385. Nobody Asked for This Washington Post Podcast
    The Present Age 2026-05-13T20:33:58+00:00
  3386. “Doesn’t 23 seem awfully old for a girl to be. Last night my 23 candles looked like immeasurable…
    Ye Olde News 2026-05-13T21:12:54+00:00
  3387. untitled
    HORSEPUSSY GALORE 2026-05-14T01:05:14+00:00
  3388. untitled
    HORSEPUSSY GALORE 2026-05-14T01:06:24+00:00
  3389. untitled
    HORSEPUSSY GALORE 2026-05-14T01:08:07+00:00
  3390. Straight men w nice butts are like sick shame and twisted sin wrapped up and i need to hit!
    HORSEPUSSY GALORE 2026-05-14T01:13:53+00:00
  3391. You’ve inspired me to bare my soul…and my pussy!
    HORSEPUSSY GALORE 2026-05-14T01:19:01+00:00
  3392. untitled
    HORSEPUSSY GALORE 2026-05-14T01:21:32+00:00
  3393. untitled
    HORSEPUSSY GALORE 2026-05-14T01:52:50+00:00
  3394. untitled
    HORSEPUSSY GALORE 2026-05-14T02:02:31+00:00
  3395. Nail 'Em (Harold Krell)
    Warp Door 2026-05-14T02:16:29+00:00
  3396. Sort Sol f. Lydia Lunch - Boy/Girl
    HORSEPUSSY GALORE 2026-05-14T02:17:39+00:00
  3397. untitled
    HORSEPUSSY GALORE 2026-05-14T02:18:20+00:00
  3398. untitled
    HORSEPUSSY GALORE 2026-05-14T02:24:57+00:00
  3399. [Outliers] Chung Ju-yung: The Hyundai Founder Who Put a Country on His Back
    Farnam Street 2026-05-14T09:50:00+00:00
  3400. Weather (Nass Reda-Fathmi)
    Warp Door 2026-05-14T10:29:08+00:00
  3401. In 1997, local television in Kharkiv accidentally filmed one of the most iconic rave moments in…
    HORSEPUSSY GALORE 2026-05-14T11:20:25+00:00
  3402. untitled
    HORSEPUSSY GALORE 2026-05-14T11:59:52+00:00
  3403. Eraserhead baby makes waffles for you!
    HORSEPUSSY GALORE 2026-05-14T12:07:34+00:00
  3404. untitled
    HORSEPUSSY GALORE 2026-05-14T12:24:01+00:00
  3405. untitled
    HORSEPUSSY GALORE 2026-05-14T12:27:02+00:00
  3406. "Two Wheels Good" Semi-Review
    The Deleted Scenes 2026-05-14T12:55:16+00:00
  3407. The Public Square Is Not Online
    City of Yes 2026-05-14T13:02:55+00:00
  3408. Why they stopped building wooden stupas
    Res Obscura 2026-05-14T13:23:31+00:00
  3409. > 206: But why the last? I ask.
    Laura Olin 2026-05-14T14:09:26+00:00
  3410. Seeing Stress in an Avalanche
    FYFD 2026-05-14T15:00:00+00:00
  3411. Is this a sex thing? It feels like a sex thing.
    Muppe 2026-05-14T16:28:39+00:00
  3412. Two-Player Sealed vs Minotaur Horde and Xenagos Revel (WUBRG Sealed)
    Mediocre Magic 2026-05-14T16:51:34+00:00
  3413. Keep your shorthand to yourself
    Muppe 2026-05-14T16:51:51+00:00
  3414. Creation and Invention Are Games We All Play
    The Bottom Feeder 2026-05-14T17:37:10+00:00
  3415. That firefighting game I played in Toronto
    Zarf Updates
  3416. Those ZIL grammar flags
    Zarf Updates
  3417. Spring games of the id
    Zarf Updates
  3418. Visible Zork 3 is now available to all
    Zarf Updates
  3419. 2026 Hugo Award finalists
    Zarf Updates
  3420. A bunch of games with nothing in common
    Zarf Updates
  3421. A Cornerstone interpreter and the mu machine
    Zarf Updates
  3422. The Curse of the Forgotten Adverbs
    Zarf Updates
  3423. Ludic Narrans
    Zarf Updates
  3424. GDC: gloom and haruspicy
    Zarf Updates
  3425. Visible Zorker: March status report
    Zarf Updates
  3426. Twine and Zork at GDC
    Zarf Updates
  3427. The Game Narrative Kaleidoscope
    Zarf Updates
  3428. 1989 in context
    Zarf Updates
  3429. Visible Zorker: status report
    Zarf Updates
  3430. GDC plans, 2026
    Zarf Updates
  3431. When is a bug not a bug?
    Zarf Updates
  3432. To fight a troll
    Zarf Updates
  3433. The Beacon is lit
    Zarf Updates
  3434. Chronological order
    Zarf Updates
  3435. The Visible Zorker Project (and Patreon)
    Zarf Updates
  3436. 2026 IGF nominees
    Zarf Updates
  3437. The Visible Zorker 2
    Zarf Updates
  3438. NarraScope is open for submissions
    Zarf Updates
  3439. Adorable little games that you should just go play
    Zarf Updates
  3440. Moving away from Tailwind, and learning to structure my CSS
    Julia Evans
  3441. Links to CSS colour palettes
    Julia Evans
  3442. Testing Vue components in the browser
    Julia Evans
  3443. Examples for the tcpdump and dig man pages
    Julia Evans
  3444. Notes on clarifying man pages
    Julia Evans
  3445. Some notes on starting to use Django
    Julia Evans
  3446. A data model for Git (and other docs updates)
    Julia Evans
  3447. Notes on switching to Helix from vim
    Julia Evans
  3448. New zine: The Secret Rules of the Terminal
    Julia Evans
  3449. Using `make` to compile C programs (for non-C-programmers)
    Julia Evans
  3450. Standards for ANSI escape codes
    Julia Evans
  3451. How to add a directory to your PATH
    Julia Evans
  3452. Some terminal frustrations
    Julia Evans
  3453. What's involved in getting a "modern" terminal setup?
    Julia Evans
  3454. "Rules" that terminal programs follow
    Julia Evans
  3455. Why pipes sometimes get "stuck": buffering
    Julia Evans
  3456. Importing a frontend Javascript library without a build system
    Julia Evans
  3457. New microblog with TILs
    Julia Evans
  3458. ASCII control characters in my terminal
    Julia Evans
  3459. Using less memory to look up IP addresses in Mess With DNS
    Julia Evans
  3460. The agent principal-agent problem
    David Crawshaw
  3461. I am building a cloud
    David Crawshaw
  3462. Eight more months of agents
    David Crawshaw
  3463. How I program with Agents
    David Crawshaw
  3464. How I program with LLMs
    David Crawshaw
  3465. jsonfile: a quick hack for tinkering
    David Crawshaw
  3466. new year, same plan
    David Crawshaw
  3467. log4j: between a rock and a hard place
    David Crawshaw
  3468. Software I’m thankful for
    David Crawshaw
  3469. Remembering the LAN
    David Crawshaw
  3470. The asymmetry of Internet identity
    David Crawshaw
  3471. Zero Trust Networks
    David Crawshaw
  3472. Go 1.13: xerrors
    David Crawshaw
  3473. Fast compilers for fast programs
    David Crawshaw
  3474. UTF-7: a ghost from the time before UTF-8
    David Crawshaw
  3475. One process programming notes (with Go and SQLite)
    David Crawshaw
  3476. Reasoning with Regret
    David Crawshaw
  3477. Searching the Creative Internet
    David Crawshaw
  3478. Service Throughput Tradeoffs
    David Crawshaw
  3479. Sharp-Edged Finalizers in Go
    David Crawshaw
  3480. The Tragedy of Finalizers
    David Crawshaw
  3481. Go and SQLite: when database/sql chafes
    David Crawshaw
  3482. Experimentation Adrift
    David Crawshaw
  3483. Leaving Google
    David Crawshaw
  3484. Less cgo overhead in Go 1.8
    David Crawshaw
  3485. BBR
    David Crawshaw
  3486. Compiler Bomb
    David Crawshaw
  3487. On recieving the News
    David Crawshaw
  3488. Buried by the media
    David Crawshaw
  3489. Smaller Go 1.7 binaries
    David Crawshaw
  3490. Good business
    David Crawshaw
  3491. Everyone a writer
    David Crawshaw
  3492. 2016-06-29
    David Crawshaw
  3493. Transaction oriented collector
    David Crawshaw
  3494. Machining under a microscope
    David Crawshaw
  3495. Limits of Superintelligence
    David Crawshaw
  3496. COPY Relocations
    David Crawshaw
  3497. Atom Feed
    David Crawshaw
  3498. 2016-02-10
    David Crawshaw
  3499. 2016-01-23
    David Crawshaw
  3500. 2016-01-18
    David Crawshaw
  3501. 2016-01-15
    David Crawshaw
  3502. 2016-01-09
    David Crawshaw
  3503. 2016-01-07
    David Crawshaw
  3504. 2016-01-05
    David Crawshaw
  3505. 2016-01-04
    David Crawshaw
  3506. 2016-01-03
    David Crawshaw
  3507. 2016-01-02
    David Crawshaw
  3508. 2016-01-01
    David Crawshaw
  3509. 2015-12-29
    David Crawshaw
  3510. Under the heel of the spirit
    David Crawshaw
  3511. 2015-12-27
    David Crawshaw
  3512. 2015-12-26
    David Crawshaw
  3513. 2015-12-20
    David Crawshaw
  3514. 2015-12-15
    David Crawshaw
  3515. 2015-12-04
    David Crawshaw
  3516. 2015-11-18
    David Crawshaw
  3517. 2015-11-16
    David Crawshaw
  3518. 2015-10-13
    David Crawshaw
  3519. 2015-08-07
    David Crawshaw
  3520. 2015-08-04
    David Crawshaw
  3521. 2015-07-27
    David Crawshaw
  3522. 2015-07-17
    David Crawshaw
  3523. 2015-07-15
    David Crawshaw
  3524. 2015-07-14
    David Crawshaw
  3525. 2015-07-07
    David Crawshaw
  3526. 2015-06-26
    David Crawshaw
  3527. 2015-06-24
    David Crawshaw
  3528. 2015-06-22
    David Crawshaw
  3529. 2015-06-01
    David Crawshaw
  3530. 2015-05-08
    David Crawshaw
  3531. 2015-05-07
    David Crawshaw
  3532. 2015-04-02
    David Crawshaw
  3533. 2015-03-10
    David Crawshaw
  3534. 2015-03-09
    David Crawshaw
  3535. 2015-03-01
    David Crawshaw
  3536. 2015-01-11
    David Crawshaw
  3537. 2015-01-10
    David Crawshaw
  3538. 2014-12-11
    David Crawshaw
  3539. 2014-07-28
    David Crawshaw
  3540. 2014-06-13
    David Crawshaw
  3541. 2014-05-14
    David Crawshaw
  3542. 2014-05-06
    David Crawshaw
  3543. 2014-04-18
    David Crawshaw
  3544. 2014-03-08
    David Crawshaw
  3545. 2014-01-17
    David Crawshaw
  3546. SyncMaster of the Universe
    Leaded Solder
  3547. Loonies for Loongsons
    Leaded Solder
  3548. Make Your Own ColecoVision At Home (Part 5 - Making More)
    Leaded Solder
  3549. Untrashing a TRS-80
    Leaded Solder
  3550. Leaded Solder vs. The Crazy 77
    Leaded Solder
  3551. Controlling the Wizzard
    Leaded Solder
  3552. Giving the SPARCstation some jumper cables
    Leaded Solder
  3553. Commodore 64 black screen failure round-up
    Leaded Solder
  3554. You’re Out of Timer
    Leaded Solder
  3555. Three Times the Fun
    Leaded Solder
  3556. Simple gpx export from ridewithgps
    Dima Kogan
  3557. mrcal 2.5 released!
    Dima Kogan
  3558. Meshroom packaged for Debian
    Dima Kogan
  3559. Using libpython3 without linking it in; and old Python, g++ compatibility patches
    Dima Kogan
  3560. Eigen macro specializations crashes
    Dima Kogan
  3561. Getting precise timings out of RS-232 output
    Dima Kogan
  3562. Shop scheduling with PuLP
    Dima Kogan
  3563. When are the days getting longer the fastest?
    Dima Kogan
  3564. Strava track filtering validation
    Dima Kogan
  3565. GNU Make: details regarding intermediate files
    Dima Kogan
  3566. Speeding up JavaScript function with AI help
    Krzysztof Kowalczyk blog
  3567. How to run msvc cl.exe from command-line (powershell)
    Krzysztof Kowalczyk blog
  3568. Novel login system for web apps
    Krzysztof Kowalczyk blog
  3569. Benchmarking JSON vs TOON in Go
    Krzysztof Kowalczyk blog
  3570. From JSON to TOON
    Krzysztof Kowalczyk blog
  3571. Fixing Zed's debugger keybindings
    Krzysztof Kowalczyk blog
  3572. Ideas for faster web dev cycle
    Krzysztof Kowalczyk blog
  3573. Zed debug setup for go server / Svelte web app
    Krzysztof Kowalczyk blog
  3574. Stage manager in Mac OS
    Krzysztof Kowalczyk blog
  3575. AltTab for Mac OS
    Krzysztof Kowalczyk blog
  3576. lazy import of JavaScript modules
    Krzysztof Kowalczyk blog
  3577. Using await in Svelte 5 components
    Krzysztof Kowalczyk blog
  3578. vite /rollup manualChunks
    Krzysztof Kowalczyk blog
  3579. Increase software sales by 50% or more
    Krzysztof Kowalczyk blog
  3580. File sync is very slow
    Krzysztof Kowalczyk blog
  3581. New Edna feature: multiple notes
    Krzysztof Kowalczyk blog
  3582. Evolving Edna Ask AI UI
    Krzysztof Kowalczyk blog
  3583. Desktop UI frameworks written by a single person
    Krzysztof Kowalczyk blog
  3584. Implementing UI translation in SumatraPDF, a C++ Windows application
    Krzysztof Kowalczyk blog
  3585. Calling Grok, OpenAI, Anthropic, Google, OpenRouter API from the browser
    Krzysztof Kowalczyk blog
  3586. Case study of over-engineered C++ code
    Krzysztof Kowalczyk blog
  3587. Increase open file limit on Ubuntu Linux
    Krzysztof Kowalczyk blog
  3588. Explaining nil interface{} gotcha in Go
    Krzysztof Kowalczyk blog
  3589. Size textarea to content
    Krzysztof Kowalczyk blog
  3590. All about Svelte 5 snippets
    Krzysztof Kowalczyk blog
  3591. Don't Use aidev-mode
    Language Agnostic
  3592. Arbitrary Update 0leinzfmdpg
    Language Agnostic
  3593. 3D Printing Field Report
    Language Agnostic
  3594. AI Multipliers
    Language Agnostic
  3595. Light and Spin
    Language Agnostic
  3596. Arbitrary Update 9999
    Language Agnostic
  3597. Not Dead Yet
    Language Agnostic
  3598. Models In The Wild
    Language Agnostic
  3599. The Scratchpad Talk
    Language Agnostic
  3600. Chop and aidev
    Language Agnostic
  3601. TASM Notes, January 9th, 2025
    Language Agnostic
  3602. Making LLMs Do What You Want to your Files
    Language Agnostic
  3603. Making LLMs Do What You Want - Interlude
    Language Agnostic
  3604. Making LLMs Do More of What You Want
    Language Agnostic
  3605. Making LLMs Do What You Want
    Language Agnostic
  3606. GCP is Bullshit and Here's Why
    Language Agnostic
  3607. Antler - Elegy
    Language Agnostic
  3608. TASM Notes, May 23rd, 2024
    Language Agnostic
  3609. TASM Notes, May 16th, 2024
    Language Agnostic
  3610. TASM Notes, May 5th 2024
    Language Agnostic
  3611. esbuild can build css
    Julia Evans: TIL
  3612. Al Sweigart's Python books are available for free
    Julia Evans: TIL
  3613. Resources for upgrading Django
    Julia Evans: TIL
  3614. You don't have to close <p> or <li> tags
    Julia Evans: TIL
  3615. Advice for writing alt text
    Julia Evans: TIL
  3616. fx: a jq replacement
    Julia Evans: TIL
  3617. CSS supports nested selectors now!
    Julia Evans: TIL
  3618. You can use `fzf` to review git commits
    Julia Evans: TIL
  3619. strace has a --stack-traces option
    Julia Evans: TIL
  3620. In CSS you can populate `content:` with a `data-` attribute
    Julia Evans: TIL
  3621. Environment variables with no equals sign
    Julia Evans: TIL
  3622. Two ways the mouse wheel works in the terminal
    Julia Evans: TIL
  3623. You can run `tty` to see your current TTY
    Julia Evans: TIL
  3624. strace's `--tips`
    Julia Evans: TIL
  3625. Tiny IP-KVM devices exist
    Julia Evans: TIL
  3626. Emoji Kitchen
    Julia Evans: TIL
  3627. pip install --user can override system libraries
    Julia Evans: TIL
  3628. why the text disappers from my PDF when I print it
    Julia Evans: TIL
  3629. `**` works for globbing in the shell
    Julia Evans: TIL
  3630. Some programming languages buffer stdout and some don't
    Julia Evans: TIL
  3631. Is Zig's New Writer Unsafe?
    openmymind.net
  3632. Everything is a []u8
    openmymind.net
  3633. I'm too dumb for Zig's new IO interface
    openmymind.net
  3634. Zig's new Writer
    openmymind.net
  3635. Zig's new LinkedList API (it's time to learn @fieldParentPtr)
    openmymind.net
  3636. Allocator.resize
    openmymind.net
  3637. ArenaAllocator.free and Nested Arenas
    openmymind.net
  3638. Zig's dot star syntax (value.*)
    openmymind.net
  3639. GetOrPut With String Keys
    openmymind.net
  3640. Comparing Strings as Integers with @bitCast
    openmymind.net
  3641. Switching on Strings in Zig
    openmymind.net
  3642. Using Generics to Inject Stubs when Testing
    openmymind.net
  3643. In Zig, What's a Writer?
    openmymind.net
  3644. Using SIMD to Tell if the High Bit is Set
    openmymind.net
  3645. Peeking Behind Zig Interfaces by Creating a Dummy std.Random Implementation
    openmymind.net
  3646. Comptime as Configuration
    openmymind.net
  3647. Zig's @bitCast
    openmymind.net
  3648. Basic Awareness in Addition to Deep Understanding
    openmymind.net
  3649. Sorting Strings in Zig
    openmymind.net
  3650. Gluing JSON
    openmymind.net
  3651. Functional Classes in Clojure
    The Clean Code Blog
  3652. Functional Classes
    The Clean Code Blog
  3653. Space War
    The Clean Code Blog
  3654. Functional Duplications
    The Clean Code Blog
  3655. Roots
    The Clean Code Blog
  3656. More On Types
    The Clean Code Blog
  3657. On Types
    The Clean Code Blog
  3658. if-else-switch
    The Clean Code Blog
  3659. Pairing Guidelines
    The Clean Code Blog
  3660. Solid Relevance
    The Clean Code Blog
  3661. Loopy
    The Clean Code Blog
  3662. Conference Conduct
    The Clean Code Blog
  3663. The Disinvitation
    The Clean Code Blog
  3664. REPL Driven Design
    The Clean Code Blog
  3665. A Little More Clojure
    The Clean Code Blog
  3666. A Little Clojure
    The Clean Code Blog
  3667. A New Hope
    The Clean Code Blog
  3668. Open Letter to the Linux Foundation
    The Clean Code Blog
  3669. What They Thought of Programmers.
    The Clean Code Blog
  3670. Circulatory
    The Clean Code Blog
  3671. MUD Day Postponed to 20 June
    The CRPG Addict
  3672. Upcoming Games: Al-Qadim (1994), The Odyssey (1993), Escape from Ragor (1994), Dungeon Arcade (1987), Pagan: Ultima VIII (1994), Warriors and Warlocks (1983), Ravenloft: Strahd's Possession (1994)
    The CRPG Addict
  3673. Game 577: Yendorian Tales: Book I
    The CRPG Addict
  3674. Nobunaga's Ambition: BASIC bushido
    Data Driven Gamer
  3675. Nobunaga's Ambition: Won!
    Data Driven Gamer
  3676. Game 470: Nobunaga's Ambition
    Data Driven Gamer
  3677. Game 469: Battle of Kawanakajima
    Data Driven Gamer
  3678. Paradroid: Won!
    Data Driven Gamer
  3679. Game 468: Paradroid
    Data Driven Gamer
  3680. Games 465-467: Hewson Consultants and the 3D Seiddab trilogy
    Data Driven Gamer
  3681. Superauthenticity: Arcade game aspect ratios
    Data Driven Gamer
  3682. Game 464: Gun.Smoke
    Data Driven Gamer
  3683. Xanadu: Won!
    Data Driven Gamer
  3684. Xanadu: How to train your dragon slayer
    Data Driven Gamer
  3685. Xanadu: Tickling the dragon
    Data Driven Gamer
  3686. Xanadu: Sea of squares
    Data Driven Gamer
  3687. Xanadu: Anxious powergaming
    Data Driven Gamer
  3688. Xanadu: Full plate and packing steel
    Data Driven Gamer
  3689. Xanadu: Honey tongue, butter fingers
    Data Driven Gamer
  3690. Xanadu: Pick poor Robin clean
    Data Driven Gamer
  3691. Xanadu: I expect you to buy
    Data Driven Gamer
  3692. Xanadu: Magic
    Data Driven Gamer
  3693. Game 463: Xanadu: Dragon Slayer II
    Data Driven Gamer
  3694. Silent Service: Tang & final rating
    Data Driven Gamer
  3695. Silent Service: Seawolf
    Data Driven Gamer
  3696. Game 462: Silent Service
    Data Driven Gamer
  3697. ST Pawn
    Data Driven Gamer
  3698. The Pawn: Won!
    Data Driven Gamer
  3699. Nintendo Promotional Toys: Kanebo's Dash Rider (ダッシュライダー) from the 1970s
    beforemario
  3700. Nintendo produced 1960s promotional card set
    beforemario
  3701. Nintendo Home Race (ホームレース, ca 1966)
    beforemario
  3702. Nintendo Playing Cards catalogue from 2001
    beforemario
  3703. A Nintendo Pilgrimage [part 7 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3704. A Nintendo Pilgrimage [part 6 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3705. A Nintendo Pilgrimage [part 5 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3706. A Nintendo Pilgrimage [part 4 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3707. A Nintendo Pilgrimage [part 3 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3708. A Nintendo Pilgrimage [part 2 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3709. Nintendo playing cards featuring Marilyn Monroe
    beforemario
  3710. A Nintendo Pilgrimage [part 1 of 7]: My Unforgettable Week in Kyoto
    beforemario
  3711. From Cards to Condiments: Nintendo’s Ads in a Disney Booklet from the 1960s
    beforemario
  3712. Ten years of Before Mario book memories
    beforemario
  3713. Meet the Collectors - #13 - Elijah Luttmann
    beforemario
  3714. Nintendo ad in 1960s Playboy magazine
    beforemario
  3715. The Nintendo Museum: my first impressions
    beforemario
  3716. A Treasure in Kyoto: Rediscovering Nintendo’s First Ad from 1894
    beforemario
  3717. Nintendo Museum's 2024 Ultra Hand Remake: honors and improves the original
    beforemario
  3718. Nintendo Poitan Game, a water toy lost in time (ポイタン ゲーム, 1966)
    beforemario
  3719. Nintendo toys in 1977 Kiddy Land catalogue
    beforemario
  3720. The Project Odyssée team visits Before Mario
    beforemario
  3721. Nintendo Patriotic Cards from 1942 and 1943 (Aikoku Hyakunin Isshu / 愛國百人一首)
    beforemario
  3722. Spot the difference: Ultra(s)cope box variants
    beforemario
  3723. Nintendo Love Peace "Smiley" e-clock (Love Peace 電気時計, circa 1971)
    beforemario
  3724. Announcing RimWorld World
    Ludeon Studios
  3725. The winter merch collection is here! ❄️
    Ludeon Studios
  3726. Holiday trade caravans are on their way!
    Ludeon Studios
  3727. Update 1.6.4630 released
    Ludeon Studios
  3728. Bring home a thrumbo and boomalope!
    Ludeon Studios
  3729. Announcing the thrumbo figure and boomalope night light!
    Ludeon Studios
  3730. Update 1.6.4566 improves gravships, shuttles, and more
    Ludeon Studios
  3731. Update 1.6.4543 released
    Ludeon Studios
  3732. Update 1.6.4535 released
    Ludeon Studios
  3733. Update 1.6.4528 released
    Ludeon Studios
  3734. Problems with video recreations of classic pinball
    @Play Collected
  3735. How to Get Started Playing Mystery Dungeon
    @Play Collected
  3736. @Play 87: Interview with Josh Ge, Creator of Cogmind
    @Play Collected
  3737. @Play 86: Interview with Dr. Thomas Biskup, Creator of ADOM
    @Play Collected
  3738. Slashware's game Ananias releases on Steam
    @Play Collected
  3739. Zelda Randomizer set to stream at 2 PM Eastern
    @Play Collected
  3740. Stuff concerning @Play, Zelda Randomizer and other things
    @Play Collected
  3741. Roguelike Celebration, Notes on My Talk
    @Play Collected
  3742. Something called the Casino Dungeon
    @Play Collected
  3743. Progress on 86
    @Play Collected
  3744. @Play 85: A Talk with Digital Eel, Makers of the Infinite Space Games
    @Play Collected
  3745. 7DRL Home Stretch!
    @Play Collected
  3746. @Play 84: The Rescue of Meta-Zelda
    @Play Collected
  3747. Update: next column, StoryBundle results
    @Play Collected
  3748. @Play 83: HyperRogue
    @Play Collected
  3749. The book is out! "@Play: Exploring Roguelike Games"
    @Play Collected
  3750. Nethack 3.6 is out!
    @Play Collected
  3751. Not done yet
    @Play Collected
  3752. EXTRA: Satoru Iwata knew what roguelikes are
    @Play Collected
  3753. @Play 82: The Talks of the International Roguelike Developers Conference US, 2015
    @Play Collected
  3754. EXTRA: Junethack
    @Play Collected
  3755. Upcoming: @Play 82 on IRDC US 2015
    @Play Collected
  3756. International Roguelike Developers Conference, Atlanta GA
    @Play Collected
  3757. EXTRA: Roguelike Radio celebrates 100 episodes!
    @Play Collected
  3758. EXTRA: Bay12 Games (of Dwarf Fortress) has a Patreon
    @Play Collected
  3759. Graphics Studies Compilation
    Adrian Courrèges
  3760. UE4 Optimized Post-Effects
    Adrian Courrèges
  3761. Metal Gear Solid V - Graphics Study
    Adrian Courrèges
  3762. Beware of Transparent Pixels
    Adrian Courrèges
  3763. DOOM (2016) - Graphics Study
    Adrian Courrèges
  3764. GTA V - Graphics Study - Part 3
    Adrian Courrèges
  3765. GTA V - Graphics Study - Part 2
    Adrian Courrèges
  3766. GTA V - Graphics Study
    Adrian Courrèges
  3767. Print Copy of SupCom Graphics Study
    Adrian Courrèges
  3768. Exp3D Goes Open-Source
    Adrian Courrèges
  3769. Supreme Commander - Graphics Study
    Adrian Courrèges
  3770. Introducing Linux Visual Novel Reader
    Adrian Courrèges
  3771. Deus Ex: Human Revolution - Graphics Study
    Adrian Courrèges
  3772. Customizing IRKit Firmware: LED and Offline Mode
    Adrian Courrèges
  3773. Introducing IRKit Web Remote
    Adrian Courrèges
  3774. IRKit Setup Guide for Android, iOS, Linux, Mac, Windows
    Adrian Courrèges
  3775. Beam Waves Live Wallpaper for Android
    Adrian Courrèges
  3776. Website Makeover
    Adrian Courrèges
  3777. Exp3D for Android and Web-Browser
    Adrian Courrèges
  3778. 5.1 sound with nForce chipset under Feisty
    Adrian Courrèges
  3779. Ludum Dare 26
    Big Bad Wofl
  3780. Wherefore art I?
    Big Bad Wofl
  3781. What have I been doing?
    Big Bad Wofl
  3782. Really need a name for this thing...
    Big Bad Wofl
  3783. Announcing [Insert Name Here]
    Big Bad Wofl
  3784. The Final Secret
    Big Bad Wofl
  3785. It's been a good run
    Big Bad Wofl
  3786. More Secrets
    Big Bad Wofl
  3787. Secret Project
    Big Bad Wofl
  3788. Morf Feedback
    Big Bad Wofl
  3789. Random World Generator
    Big Bad Wofl
  3790. Morf is back!
    Big Bad Wofl
  3791. Random River Generation
    Big Bad Wofl
  3792. Play Morf Now!
    Big Bad Wofl
  3793. Morf, JavaScript and Laziness
    Big Bad Wofl
  3794. Well, would you look at that
    Big Bad Wofl
  3795. Morf: Alpha Version
    Big Bad Wofl
  3796. Morf
    Big Bad Wofl
  3797. The Official Website is up!
    Big Bad Wofl
  3798. Terrain Coloring and Trees
    Big Bad Wofl
  3799. Display Lists and Combination Shaders
    Big Bad Wofl
  3800. Shaders
    Big Bad Wofl
  3801. Ludum Dare 25: Post Mortem
    Big Bad Wofl
  3802. Ludum Dare: 13 hours
    Big Bad Wofl
  3803. Ludum Dare: 11 hours
    Big Bad Wofl
  3804. The Bottom Feeder Has Moved On!
    The Bottom Feeder
  3805. Getting Sweet Patron Money On the Modern Internet
    The Bottom Feeder
  3806. Queen's Wish Is Out. Here's Why It's So Weird!
    The Bottom Feeder
  3807. I Am the Cheapest Bastard In Indie Games
    The Bottom Feeder
  3808. Why All Of Our Games Look Like Crap
    The Bottom Feeder
  3809. Make Them Want. Delay. Fulfill. Repeat.
    The Bottom Feeder
  3810. The Glorious, Profitable, Inescapable Art of Addiction
    The Bottom Feeder
  3811. We Did Our First Kickstarter! And It Worked!
    The Bottom Feeder
  3812. Divinity: Original Sin 2 and the Rewards of Doing One Hard Thing Right
    The Bottom Feeder
  3813. I Gave a Big Talk On Indie Games and It's Pretty Good.
    The Bottom Feeder
  3814. We Released Avernum 3: Ruined World.
    The Bottom Feeder
  3815. Cuphead, Cruelty, and Selling Unfairness to You.
    The Bottom Feeder
  3816. I Settle All Video Game Arguments, Part 2: What Is a Game?
    The Bottom Feeder
  3817. I Settle All Video Game Arguments, Part 1: Game Reviews
    The Bottom Feeder
  3818. Avernum 3, Remasters, and the Joy of Owning Your Work.
    The Bottom Feeder
  3819. The Life and Merciful Death of the Fad Controller
    The Bottom Feeder
  3820. Persona 5, Cartoon Cats, Depthless Evil, and Dating Your Teacher.
    The Bottom Feeder
  3821. Games Have Too Many Words: A Case Study.
    The Bottom Feeder
  3822. Does Your Video Game Have Too Many Words? (Yeah, Probably.)
    The Bottom Feeder
  3823. Writing Indie Games Is Like Being a Musician. In the Bad Way.
    The Bottom Feeder
  3824. We Are No Longer Supporting Android. Sigh.
    The Bottom Feeder
  3825. A Very Long Post About How to Become a Creator.
    The Bottom Feeder
  3826. We Released Avadon 3! (Also, a Few Words About Free Time)
    The Bottom Feeder
  3827. No, Video Games Aren't Art. We're BETTER.
    The Bottom Feeder
  3828. To Be a Pro is to Be Abused.
    The Bottom Feeder
  3829. Quaker principles line up quite well with modern...
    kottke.org
  3830. The Revolt Against the Girl Bosses…...
    kottke.org
  3831. The Night Witches: The Female Nazi Hunters of WWII
    kottke.org
  3832. 25 Books That Capture This American Moment. They...
    kottke.org
  3833. Software Developers Say AI Is Rotting Their Brains ....
    kottke.org
  3834. On “rich guy has an opinion” journalism,...
    kottke.org
  3835. World History Timeline
    kottke.org
  3836. New York’s Neue Galerie Will Merge With the Metropolitan...
    kottke.org
  3837. The Neanderthal dentist: archaeologists found evidence...
    kottke.org
  3838. Sarah Rose (who is blind): “Meta glasses are...
    kottke.org
  3839. How Russell Vought Became the Shadow President
    kottke.org
  3840. Omg, Amazon Prime inserted an ad for Febreze in the...
    kottke.org
  3841. A very good, very 2026 headline: Japan Runs Out of Robot...
    kottke.org
  3842. When Your Participation Is Decoration
    kottke.org
  3843. “I believe in myself. That’s why I...
    kottke.org
  3844. Shape of Dreams (Zendaya × Spike Jonze)
    kottke.org
  3845. The Guardian asked authors, critics, and academics to...
    kottke.org
  3846. The 2026 National Recording Registry inductees were...
    kottke.org
  3847. This Tiny Celestial Body Past Pluto Shouldn’t Have an...
    kottke.org
  3848. A Moment That Changed Me: I Saw My First Total Solar...
    kottke.org
  3849. What Childhood Folklore Did You Learn As a Kid?
    kottke.org
  3850. I Want to Live Like Costco People. “Embracing the...
    kottke.org
  3851. Robin Sloan writes about the personalized, AI-written,...
    kottke.org
  3852. What Would J.R.R. Tolkien Think of Palantir?
    kottke.org
  3853. Meet the Sad Wives of AI. “Princess Diana...
    kottke.org
  3854. Just dropped: Foo Fighters’ Tiny Desk Concert ....
    kottke.org
  3855. Stop-Motion Lego Dr. Strangelove
    kottke.org
  3856. Adam Serwer: “Violence serves an authoritarian...
    kottke.org
  3857. A map of the regions of the US, as voted on by Reddit...
    kottke.org
  3858. “So, at about 14, I became the team’s...
    kottke.org
  3859. The World Press Freedom Index at Global 25-Year Low
    kottke.org
  3860. Sounds of the 60s: the IBM 1401 (punchcard collation,...
    kottke.org
  3861. Jamelle Bouie thinks Alexandria Ocasio-Cortez is...
    kottke.org
  3862. Being Fed Content
    kottke.org
  3863. Study: “A few weeks of X’s algorithm can...
    kottke.org
  3864. Can You See the World When You Close Your Eyes?
    kottke.org
  3865. We’re Diversifying the University by Hiring More...
    kottke.org
  3866. Digg has (sorta) relaunched (again) and instead of an...
    kottke.org
  3867. How NASA Built Artemis II’s Fault-Tolerant...
    kottke.org
  3868. Remember Desktop Tower Defense? I played it for a bit...
    kottke.org
  3869. Taken: this is a web page that shows how much data your...
    kottke.org
  3870. Mesmerizing 4K Video of a Cat-5 Super Typhoon
    kottke.org
  3871. Interesting thread about why rural towns don’t...
    kottke.org
  3872. Wallace & Gromit 24/7 Livestream
    kottke.org
  3873. People Who Don’t Like People Are Making All of Our...
    kottke.org
  3874. Grandma Stand
    kottke.org
  3875. Where are the public benches on the internet?...
    kottke.org
  3876. Now open in NYC: a pop-up called The Donald J. Trump and...
    kottke.org
  3877. The Hidden Cassettes. “This is going to sound...
    kottke.org
  3878. In 1951, the Civil Rights Congress submitted a petition...
    kottke.org
  3879. Someone in a private forum I belong to mentioned fountain...
    kottke.org
  3880. Wowsabout!
    kottke.org
  3881. An analysis of 18 years of Guardian blind dates ....
    kottke.org
  3882. Pioneering abstract artist Hilma af Klint’s...
    kottke.org
  3883. The Design Evolution of Screwdriver Handles
    kottke.org
  3884. What Can We Do About Partisan Gerrymandering? Jamelle...
    kottke.org
  3885. Nolen Royalty: “My latest project is Marc...
    kottke.org
  3886. The 2025 Alaskan Tsunami That Measured 1578 Feet Tall
    kottke.org
  3887. Prophecy At 1420 MHz is the first single from Boards of...
    kottke.org
  3888. It’s David Attenborough’s 100th birthday...
    kottke.org
  3889. Dragoncatcher: Laying it on thick
    Robin Sloan
  3890. Dragoncatcher: News travels too fast these days
    Robin Sloan
  3891. Dragoncatcher: Referer reality
    Robin Sloan
  3892. Dragoncatcher: Claude Managed Agents feature request
    Robin Sloan
  3893. Dragoncatcher: Tone control, part 2
    Robin Sloan
  3894. Dragoncatcher: Talkie and Claude (no, the other one)
    Robin Sloan
  3895. Shopkeeper Rampant
    Robin Sloan
  3896. Dragoncatcher: The milestone of Gemma 4
    Robin Sloan
  3897. Dragoncatcher: Tinfoil
    Robin Sloan
  3898. Dragoncatcher: Reasoning models don't so much think as navigate
    Robin Sloan
  3899. Dragoncatcher: The Galactica option
    Robin Sloan
  3900. Dragoncatcher: Sweat the details
    Robin Sloan
  3901. Dragoncatcher: The bat of fate
    Robin Sloan
  3902. Winter Garden: Where is it like to be a language model?
    Robin Sloan
  3903. Dragoncatcher: Vector voxels
    Robin Sloan
  3904. Dragoncatcher: Cosleuth
    Robin Sloan
  3905. Dragoncatcher: Elemental content
    Robin Sloan
  3906. Good trains
    Robin Sloan
  3907. Dragoncatcher: Wrangler init woes
    Robin Sloan
  3908. Dragoncatcher: Maybe the G in AGI stands for Gemini
    Robin Sloan
  3909. Does meditation experience improve success with the jhanas?
    Nadia Asparouhova
  3910. How to do the jhanas
    Nadia Asparouhova
  3911. Working notes for Summer of Protocols
    Nadia Asparouhova
  3912. Explaining tech’s notion of talent scarcity
    Nadia Asparouhova
  3913. Mapping digital worlds
    Nadia Asparouhova
  3914. Early stage funding markets for science - an analysis
    Nadia Asparouhova
  3915. Mapping out the tribes of climate
    Nadia Asparouhova
  3916. Cultivating agency
    Nadia Asparouhova
  3917. Idea machines
    Nadia Asparouhova
  3918. Understanding science funding in tech, 2011-2021
    Nadia Asparouhova
  3919. Passkey transfer
    Some Bits: Nelson's Linkblog
  3920. Reddit Russian propaganda
    Some Bits: Nelson's Linkblog
  3921. NVME erasing
    Some Bits: Nelson's Linkblog
  3922. 2fa 1337
    Some Bits: Nelson's Linkblog
  3923. USB Cheat Sheet
    Some Bits: Nelson's Linkblog
  3924. xdg-ninja
    Some Bits: Nelson's Linkblog
  3925. Kyle Kingsbury Podcast Podcast
    Some Bits: Nelson's Linkblog
  3926. Is GitHub Cooked?
    Some Bits: Nelson's Linkblog
  3927. Containers vs VMs
    Some Bits: Nelson's Linkblog
  3928. India health survey (PDF)
    Some Bits: Nelson's Linkblog
  3929. Medicat USB
    Some Bits: Nelson's Linkblog
  3930. RNGdle
    Some Bits: Nelson's Linkblog
  3931. Grass Valley welcome arch
    Some Bits: Nelson's Linkblog
  3932. Oil Refineries
    Some Bits: Nelson's Linkblog
  3933. AI goblins
    Some Bits: Nelson's Linkblog
  3934. Sniffies $100M
    Some Bits: Nelson's Linkblog
  3935. 1Password + Flatpak browser
    Some Bits: Nelson's Linkblog
  3936. rpm-ostree
    Some Bits: Nelson's Linkblog
  3937. GitHub update
    Some Bits: Nelson's Linkblog
  3938. NSF board fired
    Some Bits: Nelson's Linkblog
  3939. Linux VRAM management
    Some Bits: Nelson's Linkblog
  3940. Trump vs DAF
    Some Bits: Nelson's Linkblog
  3941. Gay Kaiser scandal
    Some Bits: Nelson's Linkblog
  3942. Duncan Grant / Bathing
    Some Bits: Nelson's Linkblog
  3943. Claude postmortem
    Some Bits: Nelson's Linkblog
  3944. Building a cloud
    Some Bits: Nelson's Linkblog
  3945. Series A for exe.dev
    Some Bits: Nelson's Linkblog
  3946. wsl9x
    Some Bits: Nelson's Linkblog
  3947. 1966 Sip-In
    Some Bits: Nelson's Linkblog
  3948. pi.dev
    Some Bits: Nelson's Linkblog
  3949. The Conversation (free)
    Some Bits: Nelson's Linkblog
  3950. Biennale corruption
    Some Bits: Nelson's Linkblog
  3951. South (1959 TV)
    Some Bits: Nelson's Linkblog
  3952. Jukka and Tane (NSFW)
    Some Bits: Nelson's Linkblog
  3953. AIs in Math
    Some Bits: Nelson's Linkblog
  3954. smol machines
    Some Bits: Nelson's Linkblog
  3955. Unsloth Qwen3.6
    Some Bits: Nelson's Linkblog
  3956. Gyro monorail
    Some Bits: Nelson's Linkblog
  3957. AI risks to the Internet
    Some Bits: Nelson's Linkblog
  3958. The best response is to stop
    Some Bits: Nelson's Linkblog
  3959. CRPG Romance
    Some Bits: Nelson's Linkblog
  3960. 63 Chinese Cuisines
    Some Bits: Nelson's Linkblog
  3961. anemoia
    Some Bits: Nelson's Linkblog
  3962. plastic, prism, void
    Some Bits: Nelson's Linkblog
  3963. Light a Candle for Claude
    Some Bits: Nelson's Linkblog
  3964. Crunchy Chili Crisps
    Some Bits: Nelson's Linkblog
  3965. Charcuterie
    Some Bits: Nelson's Linkblog
  3966. ugetty
    Some Bits: Nelson's Linkblog
  3967. Chinese Cooking Demystified
    Some Bits: Nelson's Linkblog
  3968. I+G for savory flavor
    Some Bits: Nelson's Linkblog
  3969. The Future of Everything is Lies, I Guess
    Some Bits: Nelson's Linkblog
  3970. Switching to OpenStreetMap
    Some Bits: Nelson's Linkblog
  3971. Hunky Jesus 2026 (NSFWish)
    Some Bits: Nelson's Linkblog
  3972. Farrow on Altman
    Some Bits: Nelson's Linkblog
  3973. guppylm
    Some Bits: Nelson's Linkblog
  3974. Adobe fuckery
    Some Bits: Nelson's Linkblog
  3975. Jujutsu Tutorial
    Some Bits: Nelson's Linkblog
  3976. RIP Stuey Weills
    Some Bits: Nelson's Linkblog
  3977. AI security reviews
    Some Bits: Nelson's Linkblog
  3978. Learn and Test DMARC
    Some Bits: Nelson's Linkblog
  3979. GrindrPlus
    Some Bits: Nelson's Linkblog
  3980. Moonfrost
    Some Bits: Nelson's Linkblog
  3981. White House terrible app
    Some Bits: Nelson's Linkblog
  3982. Homocore anthology
    Some Bits: Nelson's Linkblog
  3983. Understanding passkeys
    Some Bits: Nelson's Linkblog
  3984. Sahlins biography
    Some Bits: Nelson's Linkblog
  3985. Squares in Squares
    Some Bits: Nelson's Linkblog
  3986. about exe.dev
    Some Bits: Nelson's Linkblog
  3987. curl > /dev/sda
    Some Bits: Nelson's Linkblog
  3988. pr-review
    Some Bits: Nelson's Linkblog
  3989. Left-right split in Paris
    Some Bits: Nelson's Linkblog
  3990. DMARC statistics
    Some Bits: Nelson's Linkblog
  3991. Ubuntu connectivity check down
    Some Bits: Nelson's Linkblog
  3992. Bluesky $100M VC funding
    Some Bits: Nelson's Linkblog
  3993. BYD Flash Charging
    Some Bits: Nelson's Linkblog
  3994. Tesla AI crash
    Some Bits: Nelson's Linkblog
  3995. In search of Banksy
    Some Bits: Nelson's Linkblog
  3996. "Mark Lawrence" AI slop
    Some Bits: Nelson's Linkblog
  3997. Google DNS cache flush
    Some Bits: Nelson's Linkblog
  3998. MALUS license washing
    Some Bits: Nelson's Linkblog
  3999. ZyncPDF
    Some Bits: Nelson's Linkblog
  4000. Landscape khipu
    Some Bits: Nelson's Linkblog
  4001. Perma.cc
    Some Bits: Nelson's Linkblog
  4002. Modular robots
    Some Bits: Nelson's Linkblog
  4003. AI ethics and market
    Some Bits: Nelson's Linkblog
  4004. Parseword
    Some Bits: Nelson's Linkblog
  4005. Attensity!
    Some Bits: Nelson's Linkblog
  4006. Trump pardon industry
    Some Bits: Nelson's Linkblog
  4007. Superpowers
    Some Bits: Nelson's Linkblog
  4008. GitHub status
    Some Bits: Nelson's Linkblog
  4009. Japanese Glory Hole
    Some Bits: Nelson's Linkblog
  4010. Vietnamese Cajun
    Some Bits: Nelson's Linkblog
  4011. Agentic Engineering Patterns
    Some Bits: Nelson's Linkblog
  4012. Joy in resistance
    Some Bits: Nelson's Linkblog
  4013. Musk PAC voter fraud
    Some Bits: Nelson's Linkblog
  4014. Andor and US fascism
    Some Bits: Nelson's Linkblog
  4015. IRCv3
    Some Bits: Nelson's Linkblog
  4016. OpenFactBook
    Some Bits: Nelson's Linkblog
  4017. "Remigration"
    Some Bits: Nelson's Linkblog
  4018. Distillation attacks
    Some Bits: Nelson's Linkblog
  4019. Black & White Snacks
    One Foot Tsunami
  4020. 💧 Quick and Creepy
    One Foot Tsunami
  4021. The Septuagenarian Resident
    One Foot Tsunami
  4022. At Least AC/DC Would Be Proud
    One Foot Tsunami
  4023. 💧 Happy Mother’s Lengths of Time
    One Foot Tsunami
  4024. One Hell of a Pop Quiz
    One Foot Tsunami
  4025. Popes — They’re Just Like Us!
    One Foot Tsunami
  4026. Divorce Registries
    One Foot Tsunami
  4027. Foiling Online Age Checks
    One Foot Tsunami
  4028. Oracle Park’s Bogus 9-9-9 Challenge Has Disappeared
    One Foot Tsunami
  4029. A Very Poor Trade
    One Foot Tsunami
  4030. Renea Gamble Prevails
    One Foot Tsunami
  4031. Pam From Wenatchee Made a Hologram
    One Foot Tsunami
  4032. The Bartered Vasectomy
    One Foot Tsunami
  4033. 💧 Sub-Two Sabastian Sawe
    One Foot Tsunami
  4034. A Very Fusilli Plan
    One Foot Tsunami
  4035. Dropping in Unannounced
    One Foot Tsunami
  4036. 💧 Sectional 42
    One Foot Tsunami
  4037. We’ve Got to Hang Our Hats on Something
    One Foot Tsunami
  4038. Snagged on a Giant C
    One Foot Tsunami
  4039. Vehicles Crushed by Snow
    One Foot Tsunami
  4040. 💧 The Magawa Monument Is Made of Stone
    One Foot Tsunami
  4041. Pivoting From Shoes to Artificial Intelligence
    One Foot Tsunami
  4042. 💧 Stay in Your Lane, Paperless Post
    One Foot Tsunami
  4043. A Massive Magawa
    One Foot Tsunami
  4044. Where Is Everybody
    Today in Tabs
  4045. It's Lamer Than You Think
    Today in Tabs
  4046. Dopefish
    Today in Tabs
  4047. Goblin Problem
    Today in Tabs
  4048. Ballroom Twits
    Today in Tabs
  4049. Papal Bull
    Today in Tabs
  4050. 5.6 Million Bees
    Today in Tabs
  4051. Everything Is Hacked Now
    Today in Tabs
  4052. The Future of Football Starts Today
    Today in Tabs
  4053. Who Goes AI?
    Today in Tabs
  4054. Aham for the Mild-Built
    Today in Tabs
  4055. Dick Hebdige Explained Dinergoth in 1979
    Today in Tabs
  4056. Sokath, His Eyes Uncovered
    Today in Tabs
  4057. Purity Supreme
    Today in Tabs
  4058. They Don't Care
    Today in Tabs
  4059. The Lowbrow Harper's
    Today in Tabs
  4060. A.I. Isn't People
    Today in Tabs
  4061. The Assassination of The Washington Post by the Coward Jeff Bezos
    Today in Tabs
  4062. Masks Off
    Today in Tabs
  4063. Welcome to the Resistance, Driving Range Guys
    Today in Tabs
  4064. Your CrossFit App Doesn’t Know What You Did
    Perfection Kills
  4065. CrossFit training in the age of AI
    Perfection Kills
  4066. Overnight success
    Perfection Kills
  4067. What’s my XENOM score?
    Perfection Kills
  4068. Reflections on training, 2025 → ‘26
    Perfection Kills
  4069. CrossFit tracking app but… you’re in control?
    Perfection Kills
  4070. My Fitness: from spreadsheet to an app
    Perfection Kills
  4071. PRzilla: CrossFit AI companion
    Perfection Kills
  4072. The science of Vipassana
    Perfection Kills
  4073. Vipassana through the modern lens
    Perfection Kills
  4074. How can coffee taste peachy?
    April Cools' Club
  4075. 🙏: please or thank you?
    April Cools' Club
  4076. My year of reading Chinese history
    April Cools' Club
  4077. Chants of Sennaar: A Review
    April Cools' Club
  4078. On Trees. But Not Those Trees.
    April Cools' Club
  4079. Myst's Minecart Maze Is Great Actually
    April Cools' Club
  4080. Don't Call It a Comedown
    April Cools' Club
  4081. Chicago vs New York Pizza is the Wrong Argument
    April Cools' Club
  4082. The Self-Cancelling Subscription
    April Cools' Club
  4083. The underrated benefits of always having oatmeal at lunch
    April Cools' Club
  4084. Puzzlehunts
    April Cools' Club
  4085. My Experience As A Rice Farmer
    April Cools' Club
  4086. Come ho programmato un videogioco per Game Boy sull’amicizia con GB Studio
    April Cools' Club
  4087. Product review: Kvikk Lunsj
    April Cools' Club
  4088. I guess I cook now
    April Cools' Club
  4089. How to decorate a child's birthday cake
    April Cools' Club
  4090. Digitisation is process optimisation
    April Cools' Club
  4091. I listened to the 1001 (?) albums I should listen to before I die
    April Cools' Club
  4092. 3D-printing a Trombone
    April Cools' Club
  4093. A non-exhaustive list of stuff I recommend
    April Cools' Club
  4094. The Irrational Decision—A Book Review
    April Cools' Club
  4095. How to Get Better at Guitar
    April Cools' Club
  4096. Celebration of Sunshine
    April Cools' Club
  4097. Personal Mineclonia World Tour
    April Cools' Club
  4098. Spoken Latin
    April Cools' Club
  4099. Language-learning anecdotes
    April Cools' Club
  4100. This music seems to be in the air...
    April Cools' Club
  4101. You should buy a meat slicer
    April Cools' Club
  4102. Does Baby Have Hat
    April Cools' Club
  4103. Dries van Noten in Five Looks
    April Cools' Club
  4104. My coffee setup
    April Cools' Club
  4105. My Adidas
    April Cools' Club
  4106. Find joy in the boring bits of life
    April Cools' Club
  4107. The Paris of our dreams
    April Cools' Club
  4108. XORry Not Sorry: The Most Amusing Security Flaws I've Discovered
    April Cools' Club
  4109. Gamer Games for Non-Gamers
    April Cools' Club
  4110. Overengineering an Obsidian dashboard to get better at Marvel Snap
    April Cools' Club
  4111. Leveraging Spaced Repetition to Power My Weekly Newsletter
    April Cools' Club
  4112. nimi sin
    April Cools' Club
  4113. Impulse Purchases
    April Cools' Club
  4114. Egg mayo sandwich optimisation
    April Cools' Club
  4115. Dynamic Graphs
    April Cools' Club
  4116. How to Run a Table Top Roleplaying Meetup
    April Cools' Club
  4117. A rough review of Capers Jones' Applied Software Measurement
    April Cools' Club
  4118. Come non ho riparato la lavatrice che si riempiva d’acqua da spenta
    April Cools' Club
  4119. What's the yield on my stonks
    April Cools' Club
  4120. Tout le monde déteste l'IA
    April Cools' Club
  4121. A New Hope
    April Cools' Club
  4122. The WiFi only works when it's raining
    April Cools' Club
  4123. Choir rehearsal score locations
    April Cools' Club
  4124. Some easy recipes
    April Cools' Club
  4125. The Tale of Daniel
    April Cools' Club
  4126. How I became a gardener
    April Cools' Club
  4127. Books, Games and Movies
    April Cools' Club
  4128. Decaf is good, actually
    April Cools' Club
  4129. Yeah, I Skate(board)
    April Cools' Club
  4130. A tour of my screenshots folder
    April Cools' Club
  4131. Kratky in the basement
    April Cools' Club
  4132. Takerufuji made history
    April Cools' Club
  4133. Adaptive Plasticity and Life History Theory
    April Cools' Club
  4134. Unusual Tips for Parenting Toddlers
    April Cools' Club
  4135. Can it Creami?
    April Cools' Club
  4136. the saga of Nat
    April Cools' Club
  4137. Making crochet cacti
    April Cools' Club
  4138. The Spice Didn't Always Flow
    April Cools' Club
  4139. Discovering coffee in Toulouse?
    April Cools' Club
  4140. Mediocrity can be a sign of excellence
    April Cools' Club
  4141. Ten weird things you can buy online (and why you would)
    April Cools' Club
  4142. Simple chicken rice
    April Cools' Club
  4143. We're Knot Friends
    April Cools' Club
  4144. What's in a username?
    April Cools' Club
  4145. 100 Incredible Tofu Recipes
    April Cools' Club
  4146. The right tempo for renaissance polyphony
    April Cools' Club
  4147. Marathon food
    April Cools' Club
  4148. I ❤️ Microscopes
    April Cools' Club
  4149. Cocktails
    April Cools' Club
  4150. To ace exams, practice the easy questions
    April Cools' Club
  4151. On Error
    April Cools' Club
  4152. Midaregami
    April Cools' Club
  4153. Ubj gb ernq EBG13 (How to read ROT13)
    April Cools' Club
  4154. You Should Charge More
    April Cools' Club
  4155. Coffee and Me: A Seven Year Love Affair
    April Cools' Club
  4156. Vihaan tekoälyä
    moser’s frame shop
  4157. Я ненавижу ИИ
    moser’s frame shop
  4158. Je hais l’IA.
    moser’s frame shop
  4159. Odio la IA
    moser’s frame shop
  4160. Odio l’IA
    moser’s frame shop
  4161. The Kirby Frame
    moser’s frame shop
  4162. Eu Odeio IA
    moser’s frame shop
  4163. I Am An AI Hater
    moser’s frame shop
  4164. Life During Class Wartime
    ongoing by Tim Bray
  4165. Corey’s Captives
    ongoing by Tim Bray
  4166. Spring Evening
    ongoing by Tim Bray
  4167. Password Manager Angst
    ongoing by Tim Bray
  4168. Long Links
    ongoing by Tim Bray
  4169. Nash Burns Saves the Day
    ongoing by Tim Bray
  4170. Pure Sound Please
    ongoing by Tim Bray
  4171. Because Algospeak
    ongoing by Tim Bray
  4172. Kansas and AI
    ongoing by Tim Bray
  4173. Crocuses of 2026
    ongoing by Tim Bray
  4174. Open Source and GenAI?
    ongoing by Tim Bray
  4175. Quamina + Claude, Case 2
    ongoing by Tim Bray
  4176. Quamina + Claude, Case 1
    ongoing by Tim Bray
  4177. Long Links
    ongoing by Tim Bray
  4178. Quamina v2.0.0
    ongoing by Tim Bray
  4179. Losing 1½ Million Lines of Go
    ongoing by Tim Bray
  4180. Regexp Lessons
    ongoing by Tim Bray
  4181. Humanist Plumbing
    ongoing by Tim Bray
  4182. After the Bubble
    ongoing by Tim Bray
  4183. Tracy Numbers
    ongoing by Tim Bray
  4184. Hearts and Minds: An Ambivalent Review of “Project Hail Mary”
    No Moods, Ads or Cutesy Fucking Icons
  4185. Periscope Depth
    No Moods, Ads or Cutesy Fucking Icons
  4186. The Plur1bus Solution
    No Moods, Ads or Cutesy Fucking Icons
  4187. Siren Songs
    No Moods, Ads or Cutesy Fucking Icons
  4188. No Obituary. Just an End.
    No Moods, Ads or Cutesy Fucking Icons
  4189. It Awaits Your Experiments.
    No Moods, Ads or Cutesy Fucking Icons
  4190. A Synopsis of Squid
    No Moods, Ads or Cutesy Fucking Icons
  4191. Beautiful Things.
    No Moods, Ads or Cutesy Fucking Icons
  4192. Perplexity: Hail Mary
    No Moods, Ads or Cutesy Fucking Icons
  4193. Outtake
    No Moods, Ads or Cutesy Fucking Icons
  4194. Hope for the New Year.
    No Moods, Ads or Cutesy Fucking Icons
  4195. Born in Pain and Sweat and Pee: the 2024 Gallery Update
    No Moods, Ads or Cutesy Fucking Icons
  4196. “The Pilot Enters the Core.
    No Moods, Ads or Cutesy Fucking Icons
  4197. Ass Man.
    No Moods, Ads or Cutesy Fucking Icons
  4198. Meet the New Boss. Same as the Old Boss.
    No Moods, Ads or Cutesy Fucking Icons
  4199. Some People Just Want to Watch the Internet Burn.
    No Moods, Ads or Cutesy Fucking Icons
  4200. The Three-Bragger Problem
    No Moods, Ads or Cutesy Fucking Icons
  4201. Two-Step Forwards, Ten Years Back
    No Moods, Ads or Cutesy Fucking Icons
  4202. Alevtina and Tamara and Lyonka, Oh My!
    No Moods, Ads or Cutesy Fucking Icons
  4203. Meet the New Boss. Same as the Old Boss.
    No Moods, Ads or Cutesy Fucking Icons
  4204. macOS Terminal - still missing the mark Apple!
    /dev/dump
  4205. Golang sync.Cond vs. Channel...
    /dev/dump
  4206. Go modules, so much promise, so much busted
    /dev/dump
  4207. Letter to Duncan Hunter (Immigration)
    /dev/dump
  4208. Self Publishing Lessons
    /dev/dump
  4209. Altering the deal... again....
    /dev/dump
  4210. Not Abandoning GitHub *yet*
    /dev/dump
  4211. Microsoft Buying GitHub Would be Bad
    /dev/dump
  4212. No, Nanomsg is NOT dead
    /dev/dump
  4213. Why I'm Boycotting Crypto Currencies
    /dev/dump
  4214. Small Business Accounting Software Woes
    /dev/dump
  4215. TLS close-notify .... what were they thinking?
    /dev/dump
  4216. CMake ExternalProject_add In Libraries
    /dev/dump
  4217. Licensing... again....
    /dev/dump
  4218. MacOS X Mystery (Challenge)
    /dev/dump
  4219. Security Advice to IoT Firmware Engineers
    /dev/dump
  4220. Microsoft Hates My Name (Not Me, Just My Name)
    /dev/dump
  4221. Leaving github
    /dev/dump
  4222. Stepping Down
    /dev/dump
  4223. What Microsoft Can Do to Make Me Hate Windows a Little Less
    /dev/dump
  4224. On Misunderstandings
    /dev/dump
  4225. A Space Shooter in Curses
    /dev/dump
  4226. Fun with terminals, character sets, Unicode, and Go
    /dev/dump
  4227. Tcell - Terminal functionality for Pure Go apps
    /dev/dump
  4228. On Go, Portability, and System Interfaces
    /dev/dump
  4229. Elevation Correction
    Alex Harsányi
  4230. A Racket Array Tutorial
    Alex Harsányi
  4231. Pumpkin Plot
    Alex Harsányi
  4232. The Wolf, the Goat, and the Cabbage
    Alex Harsányi
  4233. Timezone Lookup Revisited
    Alex Harsányi
  4234. Synchronizing FIT files using a Raspberry Pi
    Alex Harsányi
  4235. Heat Maps Revisited
    Alex Harsányi
  4236. Asteroids (Gameplay)
    Alex Harsányi
  4237. Asteroids (Game Engine)
    Alex Harsányi
  4238. Screenshots
    Alex Harsányi
  4239. Who Owns the Fish?
    Alex Harsányi
  4240. Shaded Area Plot
    Alex Harsányi
  4241. Box and Whiskers Plot
    Alex Harsányi
  4242. Climb Analysis Tool
    Alex Harsányi
  4243. Plot Animations
    Alex Harsányi
  4244. Space Invaders
    Alex Harsányi
  4245. Rendering the World Map Using the Racket Plot Package
    Alex Harsányi
  4246. Barometric Altitude Measurement
    Alex Harsányi
  4247. Automating Tests for the Plot Package
    Alex Harsányi
  4248. Ishido
    Alex Harsányi
  4249. Markdown View using the Racket editor%
    Alex Harsányi
  4250. Dependency Management in Racket Applications
    Alex Harsányi
  4251. Threshold Analysis in ActivityLog2
    Alex Harsányi
  4252. A Game of Tetris (user interface)
    Alex Harsányi
  4253. A Game of Tetris (gameplay)
    Alex Harsányi
  4254. Dual Axis Plots
    Alex Harsányi
  4255. Custom Rackunit Test Runner
    Alex Harsányi
  4256. Timezone Aware Local Time
    Alex Harsányi
  4257. Interactive Heat Maps
    Alex Harsányi
  4258. Racket Binary Packages
    Alex Harsányi
  4259. Interactive Maps in the DrRacket REPL
    Alex Harsányi
  4260. More Timezone Lookup (loading and saving data)
    Alex Harsányi
  4261. Timezone Lookup (an adventure in program optimization)
    Alex Harsányi
  4262. Timezone Visualization
    Alex Harsányi
  4263. Build Racket Packages with Azure Pipelines
    Alex Harsányi
  4264. Building a GUI Application for the Password Generator
    Alex Harsányi
  4265. Writing a Simple Password Generator in Racket
    Alex Harsányi
  4266. An Overview of Common Racket Data Structures
    Alex Harsányi
  4267. Building a Data Visualization Dashboard in Racket
    Alex Harsányi
  4268. An enhanced text-field% GUI control for Racket
    Alex Harsányi
  4269. Chess Game Using Racket's Pasteboard (part 3)
    Alex Harsányi
  4270. Chess Game Using Racket's Pasteboard (part 2)
    Alex Harsányi
  4271. Chess Game Using Racket's Pasteboard
    Alex Harsányi
  4272. Racket Data Frame Package
    Alex Harsányi
  4273. A Racket GUI Widget to display maps based on OpenStreetMap tiles
    Alex Harsányi
  4274. Running and Cycling Workout Editor
    Alex Harsányi
  4275. Arduino 433Mhz Receiver -- Reading Keyfobs
    Alex Harsányi
  4276. Interactive Overlays With the Racket Plot Package -- Update
    Alex Harsányi
  4277. Arduino Inclinometer Improvements
    Alex Harsányi
  4278. Interactive Overlays With the Racket Plot Package
    Alex Harsányi
  4279. Changing Built-in Racket Packages
    Alex Harsányi
  4280. Equipment Usage and Costs
    Alex Harsányi
  4281. Running and Outdoor Temperature
    Alex Harsányi
  4282. Arduino Inclinometer
    Alex Harsányi
  4283. Fatigue and Running Form
    Alex Harsányi
  4284. Quantifying Fatigue
    Alex Harsányi
  4285. Bike Trainer
    Alex Harsányi
  4286. Marathon Training 2017 Statistics
    Alex Harsányi
  4287. Introducing ActivityLog2
    Alex Harsányi
  4288. Making myself uncomfortable again
    Andreas Kling
  4289. MutexProtected: A C++ Pattern for Easier Concurrency
    Andreas Kling
  4290. Excellence is a habit, but so is failure
    Andreas Kling
  4291. How SerenityOS declares ssize_t
    Andreas Kling
  4292. 15 Minutes Every Day
    Andreas Kling
  4293. How I make a living working on SerenityOS
    Andreas Kling
  4294. Ladybird: A new cross-platform browser project
    Andreas Kling
  4295. Memory safety for SerenityOS
    Andreas Kling
  4296. I quit my job to focus on SerenityOS full time
    Andreas Kling
  4297. Smarter C/C++ inlining with __attribute__((flatten))
    Andreas Kling
  4298. X84 Telnet Server
    BogBoa
  4299. New Development Server
    BogBoa
  4300. For the Love of Coffee, Gadgets, and Python
    BogBoa
  4301. Desktop Linux
    BogBoa
  4302. Remove the "close window?" prompt from Gnome-Terminal
    BogBoa
  4303. The Web Client
    BogBoa
  4304. Unpacking WebSocket Frames Cont.
    BogBoa
  4305. Unpacking a WebSocket Frame
    BogBoa
  4306. WebSocket RFC 6455 Handshake
    BogBoa
  4307. WebSockets
    BogBoa
  4308. Zomborgs
    BogBoa
  4309. Failure is an Option
    BogBoa
  4310. Character Work
    BogBoa
  4311. Thoughts on Serialization
    BogBoa
  4312. Crawling a Graph
    BogBoa
  4313. Visualizing Data
    BogBoa
  4314. The Universe is a Diamond
    BogBoa
  4315. Python 3
    BogBoa
  4316. WebSockets
    BogBoa
  4317. Chrome WebSocket Protocol Update
    BogBoa
  4318. HTTP Server
    BogBoa
  4319. Netboa
    BogBoa
  4320. The Plague that struck Azeroth
    BogBoa
  4321. Not Actually Dead
    BogBoa
  4322. Thinking about Miniboa 2.0
    BogBoa
  4323. The Cold War Roots of the African Swine Flu Plague
    China Matters
  4324. Wormwood and Gall: The Frank Olson Story that Errol Morris Missed
    China Matters
  4325. The Trillion-Dollar Grift: The Long-Term Plan for US-China Decoupling
    China Matters
  4326. The Crimes of Lola Montes
    China Matters
  4327. China and the Libyan Muddle and Why Qaddafi Went Down
    China Matters
  4328. 80 Years of Injustice: The Joint, Serial, and Ongoing Betrayal of Korea by the United States and Japan
    China Matters
  4329. Coddling Japan and Coveting Okinawa: Kennan and MacArthur set the course of North Asian history post-World War II
    China Matters
  4330. America’s Blueprint for War in the South China Sea
    China Matters
  4331. What I Witnessed in 1989 in Beijing
    China Matters
  4332. “Vice”, Dick Cheney’s Ghost, and the Lies of America's Team China War
    China Matters
  4333. October 2018 Taiwan Mainland Affairs Council Public Opinion Polling on Cross Strait Relations
    China Matters
  4334. Debunking the China Debt Trap Myth, Sri Lanka/Hambantota Edition
    China Matters
  4335. The Twelve Days of Christmas...and Elvis
    China Matters
  4336. Joseph Trento's Report on the Pivotal US Role in Creating Japan's Plutonium Stockpile
    China Matters
  4337. Posited link between Zhang Shoucheng suicide and Meng Wanzhou arrest (in Chinese)
    China Matters
  4338. Sri Lanka, Rajapaksa--and China--Back in Geopolitical Play
    China Matters
  4339. August 2018 Republic of China Mainland Affairs Council Survey of Popular Attitudes on Cross Strait Relations
    China Matters
  4340. "Little Reunion": Eileen Chang gets another turn in the revisionist meatgrinder
    China Matters
  4341. Hey! What About Term Limits for the Chinese Communist Party, Xi Jinping??
    China Matters
  4342. Who Lost China? The Secret War Between Hillary Clinton and Barack Obama
    China Matters
  4343. Atomic I/O letters column #164
    Dan's Data
  4344. Atomic I/O letters column #163
    Dan's Data
  4345. Atomic I/O letters column #162
    Dan's Data
  4346. Atomic I/O letters column #161
    Dan's Data
  4347. Atomic I/O letters column #160
    Dan's Data
  4348. Atomic I/O letters column #159
    Dan's Data
  4349. Atomic I/O letters column #158
    Dan's Data
  4350. Atomic I/O letters column #157
    Dan's Data
  4351. Atomic I/O letters column #156
    Dan's Data
  4352. Atomic I/O letters column #155
    Dan's Data
  4353. Atomic I/O letters column #154
    Dan's Data
  4354. Atomic I/O letters column #153
    Dan's Data
  4355. Atomic I/O letters column #152
    Dan's Data
  4356. A comforting lie
    Dan's Data
  4357. Of course you'd download a car. Or a gun!
    Dan's Data
  4358. Atomic I/O letters column #151
    Dan's Data
  4359. Atomic I/O letters column #150
    Dan's Data
  4360. Atomic I/O letters column #149
    Dan's Data
  4361. Atomic I/O letters column #148
    Dan's Data
  4362. Atomic I/O letters column #147
    Dan's Data
  4363. Money for nothing
    Dan's Data
  4364. Atomic I/O letters column #146
    Dan's Data
  4365. Atomic I/O letters column #145
    Dan's Data
  4366. I get letters
    Dan's Data
  4367. Random... ish... numbers
    Dan's Data
  4368. Righteous bits
    Dan's Data
  4369. Atomic I/O letters column #144
    Dan's Data
  4370. Science versus SoftRAM
    Dan's Data
  4371. Seeing past the normal
    Dan's Data
  4372. Atomic I/O letters column #143
    Dan's Data
  4373. Atomic I/O letters column #142
    Dan's Data
  4374. Atomic I/O letters column #141
    Dan's Data
  4375. On the h4xx0ring of p4sswordZ
    Dan's Data
  4376. Warfare. Aliens. Car crashes. ENTERTAINMENT!
    Dan's Data
  4377. Atomic I/O letters column #140
    Dan's Data
  4378. Review: Noontec GigaLink N5 network storage box
    Dan's Data
  4379. Atomic I/O letters column #139
    Dan's Data
  4380. Socialised entertainment
    Dan's Data
  4381. Atomic I/O letters column #138
    Dan's Data
  4382. Boing!
    Dan's Data
  4383. Identical voices and phantom swords
    Dan's Data
  4384. Atomic I/O letters column #137
    Dan's Data
  4385. Review: MC Saite MC-086 mouse
    Dan's Data
  4386. Atomic I/O letters column #136
    Dan's Data
  4387. Atomic I/O letters column #135
    Dan's Data
  4388. If it looks random, it probably isn't
    Dan's Data
  4389. A deadly mouse trap
    Dan's Data
  4390. Atomic I/O letters column #134
    Dan's Data
  4391. Pathfinding to everywhere
    Dan's Data
  4392. 15.16 thousand megabytes per dollar
    Dan's Data
  4393. Grinding myself down
    Dan's Data
  4394. Dan's Data letters #210
    Dan's Data
  4395. Atomic I/O letters column #133
    Dan's Data
  4396. Stomp, don't sprint!
    Dan's Data
  4397. Atomic I/O letters column #132
    Dan's Data
  4398. Review: Miyabi 613 hunting knife
    Dan's Data
  4399. Atomic I/O letters column #131
    Dan's Data
  4400. Welcome to my museum
    Dan's Data
  4401. Atomic I/O letters column #130
    Dan's Data
  4402. Welcome to dreamland
    Dan's Data
  4403. Atomic I/O letters column #129
    Dan's Data
  4404. When you have eliminated the impossible...
    Dan's Data
  4405. Of magic lanterns, and MMORPGs
    Dan's Data
  4406. The death of the manual
    Dan's Data
  4407. Review: PCsensor FS1_P USB foot switch
    Dan's Data
  4408. Atomic I/O letters column #128
    Dan's Data
  4409. Dan's Data letters #209
    Dan's Data
  4410. Atomic I/O letters column #127
    Dan's Data
  4411. Atomic I/O letters column #126
    Dan's Data
  4412. Filenames.WTF
    Dan's Data
  4413. Atomic I/O letters column #125
    Dan's Data
  4414. In Praise of the Fisheye
    Dan's Data
  4415. Atomic I/O letters column #124
    Dan's Data
  4416. A modest censorship proposal
    Dan's Data
  4417. Atomic I/O letters column #123
    Dan's Data
  4418. Atomic I/O letters column #122
    Dan's Data
  4419. Stuck in the foothills
    Dan's Data
  4420. Atomic I/O letters column #121
    Dan's Data
  4421. The newt hits! You die...
    Dan's Data
  4422. Have you wasted enough time today?
    Dan's Data
  4423. Atomic I/O letters column #120
    Dan's Data
  4424. Atomic I/O letters column #119
    Dan's Data
  4425. Big Brother is watching you play
    Dan's Data
  4426. Dan's Data letters #208
    Dan's Data
  4427. Dan's Data letters #207
    Dan's Data
  4428. One-note NPCs
    Dan's Data
  4429. Cannibalise the corpses!
    Dan's Data
  4430. Atomic I/O letters column #118
    Dan's Data
  4431. Atomic I/O letters column #117
    Dan's Data
  4432. Atomic I/O letters column #116
    Dan's Data
  4433. Five trillion bits flying in loose formation
    Dan's Data
  4434. Atomic I/O letters column #115
    Dan's Data
  4435. Game crazy
    Dan's Data
  4436. Alt-tCRASH
    Dan's Data
  4437. Speed kings
    Dan's Data
  4438. The daily grind
    Dan's Data
  4439. Rustfmt-ing Rust
    Featherweight Musings
  4440. My Git and GitHub work flow
    Featherweight Musings
  4441. rustfmt - call for contributions
    Featherweight Musings
  4442. Contributing to Rust
    Featherweight Musings
  4443. New tutorial - arrays and vectors in Rust
    Featherweight Musings
  4444. Graphs in Rust
    Featherweight Musings
  4445. Creating a drop-in replacement for the Rust compiler
    Featherweight Musings
  4446. Recent syntactic changes to Rust
    Featherweight Musings
  4447. My thoughts on Rust in 2015
    Featherweight Musings
  4448. rustaceans.org
    Featherweight Musings
  4449. Notes on training for sport
    Featherweight Musings
  4450. Thoughts on numeric types
    Featherweight Musings
  4451. A gotcha with raw pointers and unsafe code
    Featherweight Musings
  4452. LibHoare - pre- and postconditions in Rust
    Featherweight Musings
  4453. Rust for C++ programmers - part 9: destructuring pt2 - match and borrowing
    Featherweight Musings
  4454. Rust for C++ programmers - part 8: destructuring
    Featherweight Musings
  4455. Rust for C++ programmers - part 7: data types
    Featherweight Musings
  4456. Rust for C++ programmers - part 6: Rc, Gc, and * pointers
    Featherweight Musings
  4457. Rust for C++ programmers - part 5: borrowed references
    Featherweight Musings
  4458. A thought on language design
    Featherweight Musings
  4459. Rust for C++ programmers - part 4: unique pointers
    Featherweight Musings
  4460. Rust for C++ programmers - part 3: primitive types and operators
    Featherweight Musings
  4461. Formatting change
    Featherweight Musings
  4462. Rust for C++ programmers - part 2: control flow
    Featherweight Musings
  4463. Rust for C++ programmers - an intermission - why Rust
    Featherweight Musings
  4464. Cosa è Andato al Prada Doppio Club di Miami
    greg.org: the making of
  4465. Rothko & Parsons At The National Gallery, Curated By Bunny Mellon
    greg.org: the making of
  4466. Better Read #018: Ellsworth Kelly, Notes of 1969
    greg.org: the making of
  4467. I Found An Object And Presented It As Itself Alone
    greg.org: the making of
  4468. Our Guernica Cycle - EB-5, 05.06.2017
    greg.org: the making of
  4469. Untitled (Mnuchin Gallery), 2017?
    greg.org: the making of
  4470. Three Charts Presented In Order Of Increasing Credibility
    greg.org: the making of
  4471. Ellsworth Kelly Dancing Monkey
    greg.org: the making of
  4472. UPDATE: Our Guernica Cycle - Ivanka / Merkel 03.17.2017
    greg.org: the making of
  4473. Talking Walter Hopps, Ferus, & LA with Anne Doran & Deborah Treisman, 10/29 @Alden Projects
    greg.org: the making of
  4474. Better Read #017: Embroidery Trouble Shooting Guide
    greg.org: the making of
  4475. Tommy Hilfiger Capo Personale
    greg.org: the making of
  4476. Untitled (Presidential Seal), 2017
    greg.org: the making of
  4477. Statement-As-Question: How Do You Get Here? From How Is Art History Made?
    greg.org: the making of
  4478. RIP Vern Blosum
    greg.org: the making of
  4479. Untitled (We Privatized All Of Versailles), 2017
    greg.org: the making of
  4480. Untitled (Mnuchin Gallery), 2017
    greg.org: the making of
  4481. Untitled (Boxwood Maze), 1967/2017
    greg.org: the making of
  4482. Erased Kassay JPEG
    greg.org: the making of
  4483. Ruth Asawa BMC Laundry Stamp Drawings
    greg.org: the making of
  4484. Rust and dynamically-sized thin pointers
    John Millikin
  4485. vu128: Efficient variable-length integers
    John Millikin
  4486. Creating TUN/TAP interfaces in Linux
    John Millikin
  4487. Running SunOS 4 in QEMU (SPARC)
    John Millikin
  4488. Improved UNIX socket networking in QEMU 7.2
    John Millikin
  4489. Debugging Win32 binaries in Ghidra via Wine
    John Millikin
  4490. Running BeOS 5 in QEMU (i386)
    John Millikin
  4491. Gmail accepts forged YouTube emails
    John Millikin
  4492. Compacting Lunr search indices
    John Millikin
  4493. JSON is not a YAML subset
    John Millikin
  4494. Stateless Kubernetes overlay networks with IPv6
    John Millikin
  4495. Extending VSCode with WebAssembly
    John Millikin
  4496. Notes on cross-compiling Rust
    John Millikin
  4497. First impressions of Rust
    John Millikin
  4498. Commentary on “Stop Using Encrypted Email”
    John Millikin
  4499. By any other CNAME
    John Millikin
  4500. SRE School: No Haunted Forests
    John Millikin
  4501. (More) Effective Go
    John Millikin
  4502. Error Beneath the WAVs
    John Millikin
  4503. Why I Ripped The Same CD 300 Times
    John Millikin
  4504. Effective gRPC
    John Millikin
  4505. Bazel School: Toolchains
    John Millikin
  4506. Mojibake in Surugaya Javascript
    John Millikin
  4507. UNIX Syscalls
    John Millikin
  4508. SRE School: Health Checking
    John Millikin
  4509. Reddit Front Page (2018)
    John Millikin
  4510. Re:Creators Episode 21
    John Millikin
  4511. SRE School: Instrumentation
    John Millikin
  4512. haskell-cpython: Calling Python libraries from Haskell
    John Millikin
  4513. Monad is not difficult
    John Millikin
  4514. Understanding Iteratees
    John Millikin
  4515. Replay
    NSHipster
  4516. Manim
    NSHipster
  4517. @isolated(any)
    NSHipster
  4518. Uncertain⟨T⟩
    NSHipster
  4519. Model Context Protocol (MCP)
    NSHipster
  4520. Ollama
    NSHipster
  4521. op run
    NSHipster
  4522. As We May Code
    NSHipster
  4523. WWDC 2020
    NSHipster
  4524. Language Server Protocol
    NSHipster
  4525. So Long, Prog21
    Programming in the 21st Century
  4526. Writing Video Games in a Functional Style
    Programming in the 21st Century
  4527. Progress Bars are Surprisingly Difficult
    Programming in the 21st Century
  4528. So Long, Prog21
    Programming in the 21st Century
  4529. Writing Video Games in a Functional Style
    Programming in the 21st Century
  4530. Progress Bars are Surprisingly Difficult
    Programming in the 21st Century
  4531. Rovyvon A5R flashlight diagram
    Push.cx
  4532. TypeID in Lua
    Push.cx
  4533. Broken Poker
    Push.cx
  4534. TV Setup
    Push.cx
  4535. Google Ad Injection
    Push.cx
  4536. Streaming Weekly Lobsters Office Hours
    Push.cx
  4537. Discord vs IRC Rough Notes
    Push.cx
  4538. Wrapping Large-Scale Refactors
    Push.cx
  4539. NixOS on prgmr and Failing to Learn Nix
    Push.cx
  4540. House Rules
    Push.cx
  4541. ***
    Rafał Pastuszak
  4542. Hummingbirds are Evil! Procrastination, Laziness and Play
    Rafał Pastuszak
  4543. Short: WiP
    Rafał Pastuszak
  4544. Sit.
    Rafał Pastuszak
  4545. Emotive Conjugation
    Rafał Pastuszak
  4546. useRainbow()
    Rafał Pastuszak
  4547. No Such Thing as a Fish
    Rafał Pastuszak
  4548. Short: Retrofuturetrospectives
    Rafał Pastuszak
  4549. Code sober, debug drunk
    Rafał Pastuszak
  4550. Pair Programming with Snakes
    Rafał Pastuszak
  4551. Reactive Hole
    Rafał Pastuszak
  4552. Come and say hi
    Rafał Pastuszak
  4553. Ensō
    Rafał Pastuszak
  4554. I want a good parallel computer
    Raph Levien’s blog
  4555. A note on Metal shader converter
    Raph Levien’s blog
  4556. Simplifying Bézier paths
    Raph Levien’s blog
  4557. Moving from Rust to C++
    Raph Levien’s blog
  4558. Requiem for piet-gpu-hal
    Raph Levien’s blog
  4559. Raph’s reflections and wishes for 2023
    Raph Levien’s blog
  4560. Minikin retrospective
    Raph Levien’s blog
  4561. Parallel curves of cubic Béziers
    Raph Levien’s blog
  4562. Advice for the next dozen Rust GUIs
    Raph Levien’s blog
  4563. Xilem: an architecture for UI in Rust
    Raph Levien’s blog
  4564. Add SSL to your personal website
    Seth Ladd's Blog
  4565. Dynamically load package contents with Dart's new Resource class
    Seth Ladd's Blog
  4566. New Dart SDK helps eliminates symlinks
    Seth Ladd's Blog
  4567. Null-aware operators in Dart
    Seth Ladd's Blog
  4568. Formatting Dart code before every git commit
    Seth Ladd's Blog
  4569. I ported a JavaScript app to Dart. Here's what I learned.
    Seth Ladd's Blog
  4570. Speed Up Your Dart App's Initial Load With This Transformer
    Seth Ladd's Blog
  4571. Angular and Polymer Data Binding, Together!
    Seth Ladd's Blog
  4572. How to shrink the size of your Dart app when compiled to JavaScript
    Seth Ladd's Blog
  4573. Compile-time dead code elimination with dart2js
    Seth Ladd's Blog
  4574. Forms, HTTP servers, and Polymer with Dart
    Seth Ladd's Blog
  4575. JavaZone Report. Spoiler: Awesome.
    Seth Ladd's Blog
  4576. You complete me, unless you already have a Dart future
    Seth Ladd's Blog
  4577. Polymer and Dart: A First Look
    Seth Ladd's Blog
  4578. Two-way data binding with Web UI custom elements and models
    Seth Ladd's Blog
  4579. Dart and Sencha Touch for Mobile Web Apps
    Seth Ladd's Blog
  4580. Create unified interfaces across dart:io and dart:html
    Seth Ladd's Blog
  4581. Call JavaScript from Dart - First Look
    Seth Ladd's Blog
  4582. Forms, HTTP servers, and Web Components with Dart
    Seth Ladd's Blog
  4583. Watch the video from What's New in Dart from Google I/O 2013
    Seth Ladd's Blog
  4584. Lazy Load Libraries in Dart
    Seth Ladd's Blog
  4585. Dynamically Load Code with Dart
    Seth Ladd's Blog
  4586. 6 Dart FAQs - Answered!
    Seth Ladd's Blog
  4587. First Look at Dart Mixins
    Seth Ladd's Blog
  4588. Dart on FLOSS Weekly from TWiT Network
    Seth Ladd's Blog
  4589. Neocities Is Blocked by Bing
    The Neocities Blog
  4590. Cleaner links for web pages
    The Neocities Blog
  4591. File moving/renaming, faster web sites
    The Neocities Blog
  4592. IPFS DNS Support
    The Neocities Blog
  4593. Introducing the Neocities CLI
    The Neocities Blog
  4594. The Net Neutrality Supporters Plan
    The Neocities Blog
  4595. 10x more free space
    The Neocities Blog
  4596. Introducing Neocities Site Tipping
    The Neocities Blog
  4597. We’re switching to default SSL
    The Neocities Blog
  4598. HTTP is obsolete. It’s time for the Distributed Web
    The Neocities Blog
  4599. Comet Ice
    what if?
  4600. Star Ownership
    what if?
  4601. Transatlantic Car Rental
    what if?
  4602. Hailstones
    what if?
  4603. Hot Banana
    what if?
  4604. righteous animation
    Cyberdelia NYC
  4605. RIP Dan McQuade
    Cyberdelia NYC
  4606. the joybubbles of connecting
    Cyberdelia NYC
  4607. party like it’s 1995
    Cyberdelia NYC
  4608. just thought I would let you guys at 2600 know about this
    Cyberdelia NYC
  4609. 30 Years
    Cyberdelia NYC
  4610. Cyberdelia t-shirts
    Cyberdelia NYC
  4611. She Doesn’t Even Go Here!… oh wait, yes she does
    Cyberdelia NYC
  4612. New York Times Archive webpage - August 10th 1988
    Cyberdelia NYC
  4613. Kate Was Right About Computers
    Cyberdelia NYC
  4614. 30
    Dioramas
  4615. 29
    Dioramas
  4616. 28
    Dioramas
  4617. 27
    Dioramas
  4618. 26
    Dioramas
  4619. 25
    Dioramas
  4620. 24
    Dioramas
  4621. 23
    Dioramas
  4622. 22
    Dioramas
  4623. 21
    Dioramas
  4624. untitled
    renee french
  4625. untitled
    renee french
  4626. untitled
    renee french
  4627. untitled
    renee french
  4628. untitled
    renee french
  4629. untitled
    renee french
  4630. untitled
    renee french
  4631. untitled
    renee french
  4632. untitled
    renee french
  4633. untitled
    renee french
  4634. untitled
    renee french
  4635. untitled
    renee french
  4636. untitled
    renee french
  4637. untitled
    renee french
  4638. untitled
    renee french
  4639. untitled
    renee french
  4640. untitled
    renee french
  4641. untitled
    renee french
  4642. untitled
    renee french
  4643. untitled
    renee french
  4644. untitled
    renee french
  4645. untitled
    renee french
  4646. untitled
    renee french
  4647. untitled
    renee french
  4648. untitled
    renee french
  4649. The Clues Come Together in the Chaos (WUBRG Drafting)
    Mediocre Magic 2026-05-15T01:21:48+00:00
  4650. Recency Bias
    Vittles 2026-05-15T08:07:11+00:00
  4651. I am your dentist (mathemachicken)
    Warp Door 2026-05-15T10:37:02+00:00
  4652. That’s So Cincinnati
    Midwesterner 2026-05-15T11:01:26+00:00
  4653. untitled
    https://jennifermillsnews.tumblr.com/ 2026-05-15T12:28:41+00:00
  4654. pennis (Sagun, Denny)
    Warp Door 2026-05-15T12:54:36+00:00
  4655. New and Old #266
    The Deleted Scenes 2026-05-15T12:56:05+00:00
  4656. The chopped Gym Shoe
    Food is Stupid 2026-05-15T13:00:56+00:00
  4657. untitled
    HORSEPUSSY GALORE 2026-05-15T14:13:46+00:00
  4658. untitled
    HORSEPUSSY GALORE 2026-05-15T14:16:17+00:00
  4659. untitled
    HORSEPUSSY GALORE 2026-05-15T14:18:19+00:00
  4660. untitled
    HORSEPUSSY GALORE 2026-05-15T14:47:39+00:00
  4661. The Dragon’s Eye
    FYFD 2026-05-15T15:00:00+00:00
  4662. Premium: What If...We're In An AI Bubble? (Part 1)
    Ed Zitron's Where's Your Ed At 2026-05-15T16:44:27+00:00
  4663. CONDOR
    Weird Fucking Games 2026-05-15T18:16:25+00:00
  4664. Filtered for bad addresses and good emotions
    Interconnected 2026-05-15T19:10:00+00:00

Fsyncgate: errors on fsync are unrecoverable

Dan Luu

<p><small><em>This is an archive of the original &quot;fsyncgate&quot; email thread. This is posted here because I wanted to have a link that would fit on a slide for <a href="http://danluu.com/deconstruct-files/">a talk on file safety</a> with <a href="http://danluu.com/web-bloat/">a mobile-friendly non-bloated format</a>.</em></small></p> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Subject:Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS Date:2018-03-28 02:23:46 </code></pre> <p>Hi all</p> <p>Some time ago I ran into an issue where a user encountered data corruption after a storage error. PostgreSQL played a part in that corruption by allowing checkpoint what should've been a fatal error.</p> <p>TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means &quot;all writes since the last fsync have hit disk&quot; but we assume it means &quot;all writes since the last SUCCESSFUL fsync have hit disk&quot;.</p> <p>Pg wrote some blocks, which went to OS dirty buffers for writeback. Writeback failed due to an underlying storage error. The block I/O layer and XFS marked the writeback page as failed (AS_EIO), but had no way to tell the app about the failure. When Pg called fsync() on the FD during the next checkpoint, fsync() returned EIO because of the flagged page, to tell Pg that a previous async write failed. Pg treated the checkpoint as failed and didn't advance the redo start position in the control file.</p> <p>All good so far.</p> <p>But then we retried the checkpoint, which retried the fsync(). The retry succeeded, because the prior fsync() <em>cleared the AS_EIO bad page flag</em>.</p> <p>The write never made it to disk, but we completed the checkpoint, and merrily carried on our way. Whoops, data loss.</p> <p>The clear-error-and-continue behaviour of fsync is not documented as far as I can tell. 
Nor is fsync() returning EIO unless you have a very new linux man-pages with the patch I wrote to add it. But from what I can see in the POSIX standard we are not given any guarantees about what happens on fsync() failure at all, so we're probably wrong to assume that retrying fsync( ) is safe.</p> <p>If the server had been using ext3 or ext4 with errors=remount-ro, the problem wouldn't have occurred because the first I/O error would've remounted the FS and stopped Pg from continuing. But XFS doesn't have that option. There may be other situations where this can occur too, involving LVM and/or multipath, but I haven't comprehensively dug out the details yet.</p> <p>It proved possible to recover the system by faking up a backup label from before the first incorrectly-successful checkpoint, forcing redo to repeat and write the lost blocks. But ... what a mess.</p> <p>I posted about the underlying fsync issue here some time ago:</p> <p><a href="https://stackoverflow.com/q/42434872/398670">https://stackoverflow.com/q/42434872/398670</a></p> <p>but haven't had a chance to follow up about the Pg specifics.</p> <p>I've been looking at the problem on and off and haven't come up with a good answer. I think we should just PANIC and let redo sort it out by repeating the failed write when it repeats work since the last checkpoint.</p> <p>The API offered by async buffered writes and fsync offers us no way to find out which page failed, so we can't just selectively redo that write. I think we do know the relfilenode associated with the fd that failed to fsync, but not much more. So the alternative seems to be some sort of potentially complex online-redo scheme where we replay WAL only the relation on which we had the fsync() error, while otherwise servicing queries normally. 
That's likely to be extremely error-prone and hard to test, and it's trying to solve a case where on other filesystems the whole DB would grind to a halt anyway.</p> <p>I looked into whether we can solve it with use of the AIO API instead, but the mess is even worse there - from what I can tell you can't even reliably guarantee fsync at all on all Linux kernel versions.</p> <p>We already PANIC on fsync() failure for WAL segments. We just need to do the same for data forks at least for EIO. This isn't as bad as it seems because AFAICS fsync only returns EIO in cases where we should be stopping the world anyway, and many FSes will do that for us.</p> <p>There are rather a lot of pg_fsync() callers. While we could handle this case-by-case for each one, I'm tempted to just make pg_fsync() itself intercept EIO and PANIC. Thoughts?</p> <hr /> <pre><code>From:Tom Lane &lt;tgl(at)sss(dot)pgh(dot)pa(dot)us&gt; Date:2018-03-28 03:53:08 </code></pre> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> <blockquote> <p>Retrying fsync() is not OK at least on Linux. 
When fsync() returns success it means &quot;all writes since the last fsync have hit disk&quot; but we assume it means &quot;all writes since the last SUCCESSFUL fsync have hit disk&quot;.</p> </blockquote> <p>If that's actually the case, we need to push back on this kernel brain damage, because as you're describing it fsync would be completely useless.</p> <p>Moreover, POSIX is entirely clear that successful fsync means all preceding writes for the file have been completed, full stop, doesn't matter when they were issued.</p> <hr /> <pre><code>From:Michael Paquier &lt;michael(at)paquier(dot)xyz&gt; Date:2018-03-29 02:30:59 </code></pre> <p>On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-03-29 02:48:27 </code></pre> <p>On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier wrote:</p> <blockquote> <p>On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.</p> </blockquote> <p>Craig, is the phenomenon you described the same as the second issue &quot;Reporting writeback errors&quot; discussed in this article?</p> <p><a href="https://lwn.net/Articles/724307/">https://lwn.net/Articles/724307/</a></p> <p>&quot;Current kernels might report a writeback error on an fsync() call, but there 
are a number of ways in which that can fail to happen.&quot;</p> <p>That's... I'm speechless.</p> <hr /> <pre><code>From:Justin Pryzby &lt;pryzby(at)telsasoft(dot)com&gt; Date:2018-03-29 05:00:31 </code></pre> <p>On Thu, Mar 29, 2018 at 11:30:59AM +0900, Michael Paquier wrote:</p> <blockquote> <p>On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.</p> </blockquote> <p>The retries are the source of the problem ; the first fsync() can return EIO, and also <em>clears the error</em> causing a 2nd fsync (of the same data) to return success.</p> <p>(Note, I can see that it might be useful to PANIC on EIO but retry for ENOSPC).</p> <p>On Thu, Mar 29, 2018 at 03:48:27PM +1300, Thomas Munro wrote:</p> <blockquote> <p>Craig, is the phenomenon you described the same as the second issue &quot;Reporting writeback errors&quot; discussed in this article? <a href="https://lwn.net/Articles/724307/">https://lwn.net/Articles/724307/</a></p> </blockquote> <p>Worse, the article acknowledges the behavior without apparently suggesting to change it:</p> <p>&quot;Storing that value in the file structure has an important benefit: it makes it possible to report a writeback error EXACTLY ONCE TO EVERY PROCESS THAT CALLS FSYNC() .... 
In current kernels, ONLY THE FIRST CALLER AFTER AN ERROR OCCURS HAS A CHANCE OF SEEING THAT ERROR INFORMATION.&quot;</p> <p>I believe I reproduced the problem behavior using dmsetup &quot;error&quot; target, see attached.</p> <p>strace looks like this:</p> <p>kernel is Linux 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux</p> <pre><code>1 open(&quot;/dev/mapper/eio&quot;, O_RDWR|O_CREAT, 0600) = 3
2 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 8192
3 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 8192
4 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 8192
5 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 8192
6 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 8192
7 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 8192
8 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = 2560
9 write(3, &quot;\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0&quot;..., 8192) = -1 ENOSPC (No space left on device)
10 dup(2) = 4
11 fcntl(4, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
12 brk(NULL) = 0x1299000
13 brk(0x12ba000) = 0x12ba000
14 fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
15 write(4, &quot;write(1): No space left on devic&quot;..., 34write(1): No space left on device
16 ) = 34
17 close(4) = 0
18 fsync(3) = -1 EIO (Input/output error)
19 dup(2) = 4
20 fcntl(4, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
21 fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
22 write(4, &quot;fsync(1): Input/output error\n&quot;, 29fsync(1): Input/output error
23 ) = 29
24 close(4) = 0
25 close(3) = 0
26 open(&quot;/dev/mapper/eio&quot;, O_RDWR|O_CREAT, 0600) = 3
27 fsync(3) = 0
28 write(3, &quot;\0&quot;, 1) = 1
29 fsync(3) = 0
30 exit_group(0) = ?
</code></pre> <p>2: EIO isn't seen initially due to writeback page cache;</p> <p>9: ENOSPC due to small device</p> <p>18: original IO error reported by fsync, good</p> <p>25: the original FD is closed</p> <p>26: ..and file reopened</p> <p>27: fsync on file with still-dirty data+EIO returns success BAD</p> <p>10, 19: I'm not sure why there's dup(2), I guess glibc thinks that perror should write to a separate FD (?)</p> <p>Also note, close() ALSO returned success..which you might think exonerates the 2nd fsync(), but I think may itself be problematic, no? In any case, the 2nd byte certainly never got written to DM error, and the failure status was lost following fsync().</p> <p>I get the exact same behavior if I break after one write() loop, such as to avoid ENOSPC.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-03-29 05:06:22 </code></pre> <p>On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby wrote:</p> <blockquote> <p>The retries are the source of the problem ; the first fsync() can return EIO, and also <em>clears the error</em> causing a 2nd fsync (of the same data) to return success.</p> </blockquote> <p>What I'm failing to grok here is how that error flag even matters, whether it's a single bit or a counter as described in that patch. If write back failed, <em>the page is still dirty</em>. 
So all future calls to fsync() need to try to flush it again, and (presumably) fail again (unless it happens to succeed this time around).</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-03-29 05:25:51 </code></pre> <p>On 29 March 2018 at 13:06, Thomas Munro wrote:</p> <blockquote> <p>On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby wrote:</p> <blockquote> <p>The retries are the source of the problem ; the first fsync() can return EIO, and also <em>clears the error</em> causing a 2nd fsync (of the same data) to return success.</p> </blockquote> <p>What I'm failing to grok here is how that error flag even matters, whether it's a single bit or a counter as described in that patch. If write back failed, <em>the page is still dirty</em>. So all future calls to fsync() need to try to flush it again, and (presumably) fail again (unless it happens to succeed this time around).</p> </blockquote> <p>You'd think so. But it doesn't appear to work that way. You can see for yourself with the error device-mapper destination mapped over part of a volume.</p> <p>I wrote a test case here.</p> <p><a href="https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c">https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c</a></p> <p>I don't pretend the kernel behaviour is sane. And it's possible I've made an error in my analysis. But since I've observed this in the wild, and seen it in a test case, I strongly suspect that what I've described is just what's happening, brain-dead or no.</p> <p>Presumably the kernel marks the page clean when it dispatches it to the I/O subsystem and doesn't dirty it again on I/O error? I haven't dug that deep on the kernel side. 
See the stackoverflow post for details on what I found in kernel code analysis.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-03-29 05:32:43 </code></pre> <p>On 29 March 2018 at 10:48, Thomas Munro wrote:</p> <blockquote> <p>On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier wrote:</p> <blockquote> <p>On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.</p> </blockquote> <p>Craig, is the phenomenon you described the same as the second issue &quot;Reporting writeback errors&quot; discussed in this article?</p> <p><a href="https://lwn.net/Articles/724307/">https://lwn.net/Articles/724307/</a></p> </blockquote> <p>A variant of it, by the looks.</p> <p>The problem in our case is that the kernel only tells us about the error once. It then forgets about it. So yes, that seems like a variant of the statement:</p> <blockquote> <p>&quot;Current kernels might report a writeback error on an fsync() call, but there are a number of ways in which that can fail to happen.&quot;</p> <p>That's... 
I'm speechless.</p> </blockquote> <p>Yeah.</p> <p>It's a bit nuts.</p> <p>I was astonished when I saw the behaviour, and that it appears undocumented.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-03-29 05:35:47 </code></pre> <p>On 29 March 2018 at 10:30, Michael Paquier wrote:</p> <blockquote> <p>On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.</p> </blockquote> <p>I covered this in my original post.</p> <p>Yes, we check the return value. But what do we do about it? For fsyncs of heap files, we ERROR, aborting the checkpoint. We'll retry the checkpoint later, which will retry the fsync(). <strong>Which will now appear to succeed</strong> because the kernel forgot that it lost our writes after telling us the first time. So we do check the error code, which returns success, and we complete the checkpoint and move on.</p> <p>But we only retried the fsync, not the writes before the fsync.</p> <p>So we lost data. Or rather, failed to detect that the kernel did so, so our checkpoint was bad and could not be completed.</p> <p>The problem is that we keep retrying checkpoints <em>without</em> repeating the writes leading up to the checkpoint, and retrying fsync.</p> <p>I don't pretend the kernel behaviour is sane, but we'd better deal with it anyway.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-03-29 05:58:45 </code></pre> <p>On 28 March 2018 at 11:53, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>No. 
I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as well to avoid similar lost-page-write issues.</p> <p>It's not necessary on ext3/ext4 with errors=remount-ro, but that's only because the FS stops us dead in our tracks.</p> <p>I don't pretend it's sane. The kernel behaviour is IMO crazy. If it's going to lose a write, it should at minimum mark the FD as broken so no further fsync() or anything else can succeed on the FD, and an app that cares about durability must repeat the whole set of work since the prior successful fsync(). Just reporting it once and forgetting it is madness.</p> <p>But even if we convince the kernel folks of that, how do other platforms behave? And how long before these kernels are out of use? We'd better deal with it, crazy or no.</p> <p>Please see my StackOverflow post for the kernel-level explanation. Note also the test case link there. <a href="https://stackoverflow.com/a/42436054/398670">https://stackoverflow.com/a/42436054/398670</a></p> <blockquote> <blockquote> <p>Retrying fsync() is not OK at least on Linux. When fsync() returns success it means &quot;all writes since the last fsync have hit disk&quot; but we assume it means &quot;all writes since the last SUCCESSFUL fsync have hit disk&quot;.</p> </blockquote> <p>If that's actually the case, we need to push back on this kernel brain damage, because as you're describing it fsync would be completely useless.</p> </blockquote> <p>It's not useless, it's just telling us something other than what we think it means. The promise it seems to give us is that if it reports an error once, everything <em>after</em> that is useless, so we should throw our toys, close and reopen everything, and redo from the last known-good state.</p> <p>Though as Thomas posted below, it provides rather weaker guarantees than I thought in some other areas too. 
See that lwn.net article he linked.</p> <blockquote> <p>Moreover, POSIX is entirely clear that successful fsync means all preceding writes for the file have been completed, full stop, doesn't matter when they were issued.</p> </blockquote> <p>I can't find anything that says so to me. Please quote relevant spec.</p> <p>I'm working from <a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html">http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html</a> which states that</p> <p>&quot;The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected.&quot;</p> <p>My reading is that POSIX does not specify what happens AFTER an error is detected. It doesn't say that error has to be persistent and that subsequent calls must also report the error. It also says:</p> <p>&quot;If the fsync() function fails, outstanding I/O operations are not guaranteed to have been completed.&quot;</p> <p>but that doesn't clarify matters much either, because it can be read to mean that once there's been an error reported for some IO operations there's no guarantee those operations are ever completed even after a subsequent fsync returns success.</p> <p>I'm not seeking to defend what the kernel seems to be doing. Rather, saying that we might see similar behaviour on other platforms, crazy or not. 
I haven't looked past linux yet, though.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-03-29 12:07:56 </code></pre> <p>On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer wrote:</p> <blockquote> <p>On 28 March 2018 at 11:53, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as well to avoid similar lost-page-write issues.</p> </blockquote> <p>I found your discussion with kernel hacker Jeff Layton at <a href="https://lwn.net/Articles/718734/">https://lwn.net/Articles/718734/</a> in which he said: &quot;The stackoverflow writeup seems to want a scheme where pages stay dirty after a writeback failure so that we can try to fsync them again. Note that that has never been the case in Linux after hard writeback failures, AFAIK, so programs should definitely not assume that behavior.&quot;</p> <p>The article above that says the same thing a couple of different ways, ie that writeback failure leaves you with pages that are neither written to disk successfully nor marked dirty.</p> <p>If I'm reading various articles correctly, the situation was even worse before his errseq_t stuff landed. That fixed cases of completely unreported writeback failures due to sharing of PG_error for both writeback and read errors with certain filesystems, but it doesn't address the clean pages problem.</p> <p>Yeah, I see why you want to PANIC.</p> <blockquote> <blockquote> <p>Moreover, POSIX is entirely clear that successful fsync means all preceding writes for the file have been completed, full stop, doesn't matter when they were issued.</p> </blockquote> <p>I can't find anything that says so to me. 
Please quote relevant spec.</p> <p>I'm working from <a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html">http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html</a> which states that</p> <p>&quot;The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected.&quot;</p> <p>My reading is that POSIX does not specify what happens AFTER an error is detected. It doesn't say that error has to be persistent and that subsequent calls must also report the error. It also says:</p> </blockquote> <p>FWIW my reading is the same as Tom's. It says &quot;all data for the open file descriptor&quot; without qualification or special treatment after errors. Not &quot;some&quot;.</p> <blockquote> <p>I'm not seeking to defend what the kernel seems to be doing. Rather, saying that we might see similar behaviour on other platforms, crazy or not. I haven't looked past linux yet, though.</p> </blockquote> <p>I see no reason to think that any other operating system would behave that way without strong evidence... This is openly acknowledged to be &quot;a mess&quot; and &quot;a surprise&quot; in the Filesystem Summit article. I am not really qualified to comment, but from a cursory glance at FreeBSD's vfs_bio.c I think it's doing what you'd hope for... 
see the code near the comment &quot;Failed write, redirty.&quot;</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-03-29 13:15:10 </code></pre> <p>On 29 March 2018 at 20:07, Thomas Munro wrote:</p> <blockquote> <p>On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer wrote:</p> <blockquote> <p>On 28 March 2018 at 11:53, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return.</p> </blockquote> <p>Surely you jest.</p> </blockquote> <p>No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as well to avoid similar lost-page-write issues.</p> </blockquote> <p>I found your discussion with kernel hacker Jeff Layton at <a href="https://lwn.net/Articles/718734/">https://lwn.net/Articles/718734/</a> in which he said: &quot;The stackoverflow writeup seems to want a scheme where pages stay dirty after a writeback failure so that we can try to fsync them again. Note that that has never been the case in Linux after hard writeback failures, AFAIK, so programs should definitely not assume that behavior.&quot;</p> <p>The article above that says the same thing a couple of different ways, ie that writeback failure leaves you with pages that are neither written to disk successfully nor marked dirty.</p> <p>If I'm reading various articles correctly, the situation was even worse before his errseq_t stuff landed. That fixed cases of completely unreported writeback failures due to sharing of PG_error for both writeback and read errors with certain filesystems, but it doesn't address the clean pages problem.</p> <p>Yeah, I see why you want to PANIC.</p> </blockquote> <p>In more ways than one ;)</p> <blockquote> <p>I'm not seeking to defend what the kernel seems to be doing. Rather, saying</p> <blockquote> <p>that we might see similar behaviour on other platforms, crazy or not. 
I haven't looked past linux yet, though.</p> </blockquote> <p>I see no reason to think that any other operating system would behave that way without strong evidence... This is openly acknowledged to be &quot;a mess&quot; and &quot;a surprise&quot; in the Filesystem Summit article. I am not really qualified to comment, but from a cursory glance at FreeBSD's vfs_bio.c I think it's doing what you'd hope for... see the code near the comment &quot;Failed write, redirty.&quot;</p> </blockquote> <p>Ok, that's reassuring, but doesn't help us on the platform the great majority of users deploy on :(</p> <p>&quot;If on Linux, PANIC&quot;</p> <p>Hrm.</p> <hr /> <pre><code>From:Catalin Iacob &lt;iacobcatalin(at)gmail(dot)com&gt; Date:2018-03-29 16:20:00 </code></pre> <p>On Thu, Mar 29, 2018 at 2:07 PM, Thomas Munro wrote:</p> <blockquote> <p>I found your discussion with kernel hacker Jeff Layton at <a href="https://lwn.net/Articles/718734/">https://lwn.net/Articles/718734/</a> in which he said: &quot;The stackoverflow writeup seems to want a scheme where pages stay dirty after a writeback failure so that we can try to fsync them again. Note that that has never been the case in Linux after hard writeback failures, AFAIK, so programs should definitely not assume that behavior.&quot;</p> </blockquote> <p>And a bit below in the same comments, to this question about PG: &quot;So, what are the options at this point? The assumption was that we can repeat the fsync (which as you point out is not the case), or shut down the database and perform recovery from WAL&quot;, the same Jeff Layton seems to agree PANIC is the appropriate response: &quot;Replaying the WAL synchronously sounds like the simplest approach when you get an error on fsync. These are uncommon occurrences for the most part, so having to fall back to slow, synchronous error recovery modes when this occurs is probably what you want to do.&quot;. 
And right after, he confirms the errseq_t patches are about always detecting this, not more: &quot;The main thing I working on is to better guarantee is that you actually get an error when this occurs rather than silently corrupting your data. The circumstances where that can occur require some corner-cases, but I think we need to make sure that it doesn't occur.&quot;</p> <p>Jeff's comments in the pull request that merged errseq_t are worth reading as well: <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750">https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750</a></p> <blockquote> <p>The article above that says the same thing a couple of different ways, ie that writeback failure leaves you with pages that are neither written to disk successfully nor marked dirty.</p> <p>If I'm reading various articles correctly, the situation was even worse before his errseq_t stuff landed. That fixed cases of completely unreported writeback failures due to sharing of PG_error for both writeback and read errors with certain filesystems, but it doesn't address the clean pages problem.</p> </blockquote> <p>Indeed, that's exactly how I read it as well (opinion formed independently before reading your sentence above). The errseq_t patches landed in v4.13 by the way, so very recently.</p> <blockquote> <p>Yeah, I see why you want to PANIC.</p> </blockquote> <p>Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. 
This is messy.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-03-29 21:18:14 </code></pre> <p>On Fri, Mar 30, 2018 at 5:20 AM, Catalin Iacob wrote:</p> <blockquote> <p>Jeff's comments in the pull request that merged errseq_t are worth reading as well: <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750">https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750</a></p> </blockquote> <p>Wow. It looks like there may be a separate question of when each filesystem adopted this new infrastructure?</p> <blockquote> <blockquote> <p>Yeah, I see why you want to PANIC.</p> </blockquote> <p>Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.</p> </blockquote> <p>The pre-errseq_t problems are beyond our control. There's nothing we can do about that in userspace (except perhaps abandon OS-buffered IO, a big project). We just need to be aware that this problem exists in certain kernel versions and be grateful to Layton for fixing it.</p> <p>The dropped dirty flag problem is something we can and in my view should do something about, whatever we might think about that design choice. As Andrew Gierth pointed out to me in an off-list chat about this, by the time you've reached this state, both PostgreSQL's buffer and the kernel's buffer are clean and might be reused for another block at any time, so your data might be gone from the known universe -- we don't even have the option to rewrite our buffers in general. 
Recovery is the only option.</p> <p>Thank you to Craig for chasing this down and +1 for his proposal, on Linux only.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-03-31 13:24:28 </code></pre> <p>On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:</p> <blockquote> <blockquote> <blockquote> <p>Yeah, I see why you want to PANIC.</p> </blockquote> <p>Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.</p> </blockquote> </blockquote> <p>There may still be a way to reliably detect this on older kernel versions from userspace, but it will be messy whatsoever. On EIO errors, the kernel will not restore the dirty page flags, but it will flip the error flags on the failed pages. One could mmap() the file in question, obtain the PFNs (via /proc/pid/pagemap) and enumerate those to match the ones with the error flag switched on (via /proc/kpageflags). This could serve at least as a detection mechanism, but one could also further use this info to logically map the pages that failed IO back to the original file offsets, and potentially retry IO just for those file ranges that cover the failed pages. Just an idea, not tested.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-03-31 16:13:09 </code></pre> <p>On 31 March 2018 at 21:24, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:</p> <blockquote> <blockquote> <blockquote> <p>Yeah, I see why you want to PANIC.</p> </blockquote> <p>Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. 
This is messy.</p> </blockquote> </blockquote> <p>There may still be a way to reliably detect this on older kernel versions from userspace, but it will be messy whatsoever. On EIO errors, the kernel will not restore the dirty page flags, but it will flip the error flags on the failed pages. One could mmap() the file in question, obtain the PFNs (via /proc/pid/pagemap) and enumerate those to match the ones with the error flag switched on (via /proc/kpageflags). This could serve at least as a detection mechanism, but one could also further use this info to logically map the pages that failed IO back to the original file offsets, and potentially retry IO just for those file ranges that cover the failed pages. Just an idea, not tested.</p> </blockquote> <p>That sounds like a huge amount of complexity, with uncertainty as to how it'll behave kernel-to-kernel, for negligible benefit.</p> <p>I was exploring the idea of doing selective recovery of one relfilenode, based on the assumption that we know the filenode related to the fd that failed to fsync(). We could redo only WAL on that relation. But it fails the same test: it's too complex for a niche case that shouldn't happen in the first place, so it'll probably have bugs, or grow bugs in bitrot over time.</p> <p>Remember, if you're on ext4 with errors=remount-ro, you get shut down even harder than a PANIC. So we should just use the big hammer here.</p> <p>I'll send a patch this week.</p> <hr /> <pre><code>From:Tom Lane &lt;tgl(at)sss(dot)pgh(dot)pa(dot)us&gt; Date:2018-03-31 16:38:12 </code></pre> <p>Craig Ringer writes:</p> <blockquote> <p>So we should just use the big hammer here.</p> </blockquote> <p>And bitch, loudly and publicly, about how broken this kernel behavior is. 
If we make enough of a stink maybe it'll get fixed.</p> <hr /> <pre><code>From:Michael Paquier &lt;michael(at)paquier(dot)xyz&gt; Date:2018-04-01 00:20:38 </code></pre> <p>On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>So we should just use the big hammer here.</p> </blockquote> <p>And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.</p> </blockquote> <p>That won't fix anything released already, so as per the information gathered something has to be done anyway. The discussion of this thread is spreading quite a lot actually.</p> <p>Handling things at a low-level looks like a better plan for the backend. Tools like pg_basebackup and pg_dump also issue fsync's on the data created, we should do an equivalent for them, with some exit() calls in file_utils.c. As of now failures are logged to stderr but not considered fatal.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-01 00:58:22 </code></pre> <p>On Sun, Apr 01, 2018 at 12:13:09AM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 31 March 2018 at 21:24, Anthony Iliopoulos &lt;[1]ailiop(at)altatus(dot)com&gt; wrote:</p> <pre><code> On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote: &gt; &gt;&gt; Yeah, I see why you want to PANIC. &gt; &gt; &gt; &gt; Indeed. Even doing that leaves question marks about all the kernel &gt; &gt; versions before v4.13, which at this point is pretty much everything &gt; &gt; out there, not even detecting this reliably. This is messy. </code></pre> <blockquote> <pre><code> There may still be a way to reliably detect this on older kernel versions from userspace, but it will be messy whatsoever. On EIO errors, the kernel will not restore the dirty page flags, but it will flip the error flags on the failed pages. 
One could mmap() the file in question, obtain the PFNs (via /proc/pid/pagemap) and enumerate those to match the ones with the error flag switched on (via /proc/kpageflags). This could serve at least as a detection mechanism, but one could also further use this info to logically map the pages that failed IO back to the original file offsets, and potentially retry IO just for those file ranges that cover the failed pages. Just an idea, not tested. </code></pre> </blockquote> <p>That sounds like a huge amount of complexity, with uncertainty as to how it'll behave kernel-to-kernel, for negligible benefit.</p> </blockquote> <p>Those interfaces have been around since the kernel 2.6 times and are rather stable, but I was merely responding to your original post comment regarding having a way of finding out which page(s) failed. I assume that indeed there would be no benefit, especially since those errors are usually not transient (typically they come from hard medium faults), and although a filesystem could theoretically mask the error by allocating a different logical block, I am not aware of any implementation that currently does that.</p> <blockquote> <p>I was exploring the idea of doing selective recovery of one relfilenode, based on the assumption that we know the filenode related to the fd that failed to fsync(). We could redo only WAL on that relation. But it fails the same test: it's too complex for a niche case that shouldn't happen in the first place, so it'll probably have bugs, or grow bugs in bitrot over time.</p> </blockquote> <p>Fully agree, those cases should be sufficiently rare that a complex and possibly non-maintainable solution is not really warranted.</p> <blockquote> <p>Remember, if you're on ext4 with errors=remount-ro, you get shut down even harder than a PANIC. So we should just use the big hammer here.</p> </blockquote> <p>I am not entirely sure what you mean here, does Pg really treat write() errors as fatal? 
Also, the kind of errors that ext4 detects with this option is at the superblock level and govern metadata rather than actual data writes (recall that those are buffered anyway, no actual device IO has to take place at the time of write()).</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-01 01:14:46 </code></pre> <p>On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>So we should just use the big hammer here.</p> </blockquote> <p>And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.</p> </blockquote> <p>It is not likely to be fixed (beyond what has been done already with the manpage patches and errseq_t fixes on the reporting level). The issue is, the kernel needs to deal with hard IO errors at that level somehow, and since those errors typically persist, re-dirtying the pages would not really solve the problem (unless some filesystem remaps the request to a different block, assuming the device is alive). Keeping around dirty pages that cannot possibly be written out is essentially a memory leak, as those pages would stay around even after the application has exited.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-01 18:24:51 </code></pre> <p>On Fri, Mar 30, 2018 at 10:18 AM, Thomas Munro wrote:</p> <blockquote> <p>... on Linux only.</p> </blockquote> <p>Apparently I was too optimistic. I had looked only at FreeBSD, which keeps the page around and dirties it so we can retry, but the other BSDs apparently don't (FreeBSD changed that in 1999). 
From what I can tell from the sources below, we have:</p> <pre><code>Linux, OpenBSD, NetBSD: retrying fsync() after EIO lies FreeBSD, Illumos: retrying fsync() after EIO tells the truth </code></pre> <p>Maybe my drive-by assessment of those kernel routines is wrong and someone will correct me, but I'm starting to think you might be better to assume the worst on all systems. Perhaps a GUC that defaults to panicking, so that users on those rare OSes could turn that off? Even then I'm not sure if the failure mode will be that great anyway or if it's worth having two behaviours. Thoughts?</p> <p><a href="http://mail-index.netbsd.org/netbsd-users/2018/03/30/msg020576.html">http://mail-index.netbsd.org/netbsd-users/2018/03/30/msg020576.html</a> <a href="https://github.com/NetBSD/src/blob/trunk/sys/kern/vfs_bio.c#L1059">https://github.com/NetBSD/src/blob/trunk/sys/kern/vfs_bio.c#L1059</a> <a href="https://github.com/openbsd/src/blob/master/sys/kern/vfs_bio.c#L867">https://github.com/openbsd/src/blob/master/sys/kern/vfs_bio.c#L867</a> <a href="https://github.com/freebsd/freebsd/blob/master/sys/kern/vfs_bio.c#L2631">https://github.com/freebsd/freebsd/blob/master/sys/kern/vfs_bio.c#L2631</a> <a href="https://github.com/freebsd/freebsd/commit/e4e8fec98ae986357cdc208b04557dba55a59266">https://github.com/freebsd/freebsd/commit/e4e8fec98ae986357cdc208b04557dba55a59266</a> <a href="https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/bio.c#L441">https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/bio.c#L441</a></p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-02 15:03:42 </code></pre> <p>On 2 April 2018 at 02:24, Thomas Munro wrote:</p> <blockquote> <p>Maybe my drive-by assessment of those kernel routines is wrong and someone will correct me, but I'm starting to think you might be better to assume the worst on all systems. 
Perhaps a GUC that defaults to panicking, so that users on those rare OSes could turn that off? Even then I'm not sure if the failure mode will be that great anyway or if it's worth having two behaviours. Thoughts?</p> </blockquote> <p>I see little benefit to not just PANICing unconditionally on EIO, really. It shouldn't happen, and if it does, we want to be pretty conservative and adopt a data-protective approach.</p> <p>I'm rather more worried by doing it on ENOSPC. Which looks like it might be necessary from what I recall finding in my test case + kernel code reading. I really don't want to respond to a possibly-transient ENOSPC by PANICing the whole server unnecessarily.</p> <p>BTW, the support team at 2ndQ is presently working on two separate issues where ENOSPC resulted in DB corruption, though neither of them involve logs of lost page writes. I'm planning on taking some time tomorrow to write a torture tester for Pg's ENOSPC handling and to verify ENOSPC handling in the test case I linked to in my original StackOverflow post.</p> <p>If this is just an EIO issue then I see no point doing anything other than PANICing unconditionally.</p> <p>If it's a concern for ENOSPC too, we should try harder to fail more nicely whenever we possibly can.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-02 18:13:46 </code></pre> <p>Hi,</p> <p>On 2018-04-01 03:14:46 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>So we should just use the big hammer here.</p> </blockquote> <p>And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.</p> </blockquote> <p>It is not likely to be fixed (beyond what has been done already with the manpage patches and errseq_t fixes on the reporting level). 
The issue is, the kernel needs to deal with hard IO errors at that level somehow, and since those errors typically persist, re-dirtying the pages would not really solve the problem (unless some filesystem remaps the request to a different block, assuming the device is alive).</p> </blockquote> <p>Throwing away the dirty pages <em>and</em> persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases that the application that wants to gracefully deal with the case.</p> <blockquote> <p>Keeping around dirty pages that cannot possibly be written out is essentially a memory leak, as those pages would stay around even after the application has exited.</p> </blockquote> <p>Why do dirty pages need to be kept around in the case of persistent errors? I don't think the lack of automatic recovery in that case is what anybody is complaining about. It's that the error goes away and there's no reasonable way to separate out such an error from some potential transient errors.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-02 18:53:20 </code></pre> <p>On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-01 03:14:46 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:</p> <blockquote> <p>Craig Ringer writes:</p> <blockquote> <p>So we should just use the big hammer here.</p> </blockquote> <p>And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.</p> </blockquote> <p>It is not likely to be fixed (beyond what has been done already with the manpage patches and errseq_t fixes on the reporting level). 
The issue is, the kernel needs to deal with hard IO errors at that level somehow, and since those errors typically persist, re-dirtying the pages would not really solve the problem (unless some filesystem remaps the request to a different block, assuming the device is alive).</p> </blockquote> <p>Throwing away the dirty pages <em>and</em> persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases that the application that wants to gracefully deal with the case.</p> </blockquote> <p>Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).</p> <p>The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-02 19:32:45 </code></pre> <p>On 2018-04-02 20:53:20 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:</p> <blockquote> <p>Throwing away the dirty pages <em>and</em> persisting the error seems a lot more reasonable. 
Then provide a fcntl (or whatever) extension that can clear the error status in the few cases that the application that wants to gracefully deal with the case.</p> </blockquote> <p>Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).</p> </blockquote> <p>Meh^2.</p> <p>&quot;no reason&quot; - except that there's absolutely no way to know what state the data is in. And that your application needs explicit handling of such failures. And that one FD might be used in lots of different parts of the application, that fsyncs in one part of the application might be an ok failure, and in another not. Requiring explicit actions to acknowledge &quot;we've thrown away your data for unknown reason&quot; seems entirely reasonable.</p> <blockquote> <p>The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that.</p> </blockquote> <p>Which isn't what I've suggested.</p> <blockquote> <p>Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.</p> </blockquote> <p>Meh.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-02 20:38:06 </code></pre> <p>On Mon, Apr 02, 2018 at 12:32:45PM -0700, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-02 20:53:20 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:</p> <blockquote> <p>Throwing away the dirty pages <em>and</em> persisting the error seems a lot more reasonable. 
Then provide a fcntl (or whatever) extension that can clear the error status in the few cases that the application that wants to gracefully deal with the case.</p> </blockquote> <p>Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).</p> </blockquote> <p>Meh^2.</p> <p>&quot;no reason&quot; - except that there's absolutely no way to know what state the data is in. And that your application needs explicit handling of such failures. And that one FD might be used in lots of different parts of the application, that fsyncs in one part of the application might be an ok failure, and in another not. Requiring explicit actions to acknowledge &quot;we've thrown away your data for unknown reason&quot; seems entirely reasonable.</p> </blockquote> <p>As long as fsync() indicates error on first invocation, the application is fully aware that between this point of time and the last call to fsync() data has been lost. Persisting this error any further does not change this or add any new info - on the contrary it adds confusion as subsequent write()s and fsync()s on other pages can succeed, but will be reported as failures.</p> <p>The application will need to deal with that first error irrespective of subsequent return codes from fsync(). 
Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.</p> <p>Callers that are not affected by the potential outcome of fsync() and do not react on errors, have no reason for calling it in the first place (and thus masking failure from subsequent callers that may indeed care).</p> <hr /> <pre><code>From:Stephen Frost &lt;sfrost(at)snowman(dot)net&gt; Date:2018-04-02 20:58:08 </code></pre> <p>Greetings,</p> <p>Anthony Iliopoulos (ailiop(at)altatus(dot)com) wrote:</p> <blockquote> <p>On Mon, Apr 02, 2018 at 12:32:45PM -0700, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-02 20:53:20 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:</p> <blockquote> <p>Throwing away the dirty pages <em>and</em> persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases that the application that wants to gracefully deal with the case.</p> </blockquote> <p>Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).</p> </blockquote> <p>Meh^2.</p> <p>&quot;no reason&quot; - except that there's absolutely no way to know what state the data is in. And that your application needs explicit handling of such failures. And that one FD might be used in lots of different parts of the application, that fsyncs in one part of the application might be an ok failure, and in another not. 
Requiring explicit actions to acknowledge &quot;we've thrown away your data for unknown reason&quot; seems entirely reasonable.</p> </blockquote> <p>As long as fsync() indicates error on first invocation, the application is fully aware that between this point of time and the last call to fsync() data has been lost. Persisting this error any further does not change this or add any new info - on the contrary it adds confusion as subsequent write()s and fsync()s on other pages can succeed, but will be reported as failures.</p> </blockquote> <p>fsync() doesn't reflect the status of given pages, however, it reflects the status of the file descriptor, and as such the file, on which it's called. This notion that fsync() is actually only responsible for the changes which were made to a file since the last fsync() call is pure foolishness. If we were able to pass a list of pages or data ranges to fsync() for it to verify they're on disk then perhaps things would be different, but we can't, all we can do is ask to &quot;please flush all the dirty pages associated with this file descriptor, which represents this file we opened, to disk, and let us know if you were successful.&quot;</p> <p>Give us a way to ask &quot;are these specific pages written out to persistent storage?&quot; and we would certainly be happy to use it, and to repeatedly try to flush out pages which weren't synced to disk due to some transient error, and to track those cases and make sure that we don't incorrectly assume that they've been transferred to persistent storage.</p> <blockquote> <p>The application will need to deal with that first error irrespective of subsequent return codes from fsync(). 
Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.</p> </blockquote> <p>We do deal with that error- by realizing that it failed and later <em>retrying</em> the fsync(), which is when we get back an &quot;all good! everything with this file descriptor you've opened is sync'd!&quot; and happily expect that to be truth, when, in reality, it's an unfortunate lie and there are still pages associated with that file descriptor which are, in reality, dirty and not sync'd to disk.</p> <p>Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?</p> <blockquote> <p>Callers that are not affected by the potential outcome of fsync() and do not react on errors, have no reason for calling it in the first place (and thus masking failure from subsequent callers that may indeed care).</p> </blockquote> <p>Reacting on an error from an fsync() call could, based on how it's documented and actually implemented in other OS's, mean &quot;run another fsync() to see if the error has resolved itself.&quot; Requiring that to mean &quot;you have to go dirty all of the pages you previously dirtied to actually get a subsequent fsync() to do anything&quot; is really just not reasonable- a given program may have no idea what was written to previously nor any particular reason to need to know, on the expectation that the fsync() call will flush any dirty pages, as it's documented to do.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-02 23:05:44 </code></pre> <p>Hi Stephen,</p> <p>On Mon, Apr 02, 2018 at 
04:58:08PM -0400, Stephen Frost wrote:</p> <blockquote> <p>fsync() doesn't reflect the status of given pages, however, it reflects the status of the file descriptor, and as such the file, on which it's called. This notion that fsync() is actually only responsible for the changes which were made to a file since the last fsync() call is pure foolishness. If we were able to pass a list of pages or data ranges to fsync() for it to verify they're on disk then perhaps things would be different, but we can't, all we can do is ask to &quot;please flush all the dirty pages associated with this file descriptor, which represents this file we opened, to disk, and let us know if you were successful.&quot;</p> <p>Give us a way to ask &quot;are these specific pages written out to persistent storage?&quot; and we would certainly be happy to use it, and to repeatedly try to flush out pages which weren't synced to disk due to some transient error, and to track those cases and make sure that we don't incorrectly assume that they've been transferred to persistent storage.</p> </blockquote> <p>Indeed fsync() is simply a rather blunt instrument and a narrow legacy interface but further changing its established semantics (no matter how unreasonable they may be) is probably not the way to go.</p> <p>Would using sync_file_range() be helpful? Potential errors would only apply to pages that cover the requested file ranges. 
There are a few caveats though:</p> <p>(a) it still messes with the top-level error reporting so mixing it with callers that use fsync() and do care about errors will produce the same issue (clearing the error status).</p> <p>(b) the error-reporting granularity is coarse (failure reporting applies to the entire requested range so you still don't know which particular pages/file sub-ranges failed writeback)</p> <p>(c) the same &quot;report and forget&quot; semantics apply to repeated invocations of the sync_file_range() call, so again action will need to be taken upon first error encountered for the particular ranges.</p> <blockquote> <blockquote> <p>The application will need to deal with that first error irrespective of subsequent return codes from fsync(). Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.</p> </blockquote> <p>We do deal with that error- by realizing that it failed and later <em>retrying</em> the fsync(), which is when we get back an &quot;all good! everything with this file descriptor you've opened is sync'd!&quot; and happily expect that to be truth, when, in reality, it's an unfortunate lie and there are still pages associated with that file descriptor which are, in reality, dirty and not sync'd to disk.</p> </blockquote> <p>It really turns out that this is not how the fsync() semantics work though, exactly because the nature of the errors: even if the kernel retained the dirty bits on the failed pages, retrying persisting them on the same disk location would simply fail. Instead the kernel opts for marking those pages clean (since there is no other recovery strategy), and reporting once to the caller who can potentially deal with it in some manner. 
It is sadly a bad and undocumented convention.</p> <blockquote> <p>Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?</p> </blockquote> <p>I think what you have in mind are the semantics of sync() rather than fsync(), but as long as an application needs to ensure data are persisted to storage, it needs to retain those data in its heap until fsync() is successful instead of discarding them and relying on the kernel after write(). The pattern should be roughly like: write() -&gt; fsync() -&gt; free(), rather than write() -&gt; free() -&gt; fsync(). For example, if a partition gets full upon fsync(), then the application has a chance to persist the data in a different location, while the kernel cannot possibly make this decision and recover.</p> <blockquote> <blockquote> <p>Callers that are not affected by the potential outcome of fsync() and do not react on errors, have no reason for calling it in the first place (and thus masking failure from subsequent callers that may indeed care).</p> </blockquote> <p>Reacting on an error from an fsync() call could, based on how it's documented and actually implemented in other OS's, mean &quot;run another fsync() to see if the error has resolved itself.&quot; Requiring that to mean &quot;you have to go dirty all of the pages you previously dirtied to actually get a subsequent fsync() to do anything&quot; is really just not reasonable- a given program may have no idea what was written to previously nor any particular reason to need to know, on the expectation that the fsync() call will flush any dirty pages, as it's documented to do.</p> </blockquote> <p>I think we are conflating a few 
issues here: having the OS kernel being responsible for error recovery (so that subsequent fsync() would fix the problems) is one. This clearly is a design which most kernels have not really adopted for reasons outlined above (although having the FS layer recovering from hard errors transparently is open for discussion from what it seems [1]). Now, there is the issue of granularity of error reporting: userspace could benefit from a fine-grained indication of failed pages (or file ranges). Another issue is that of reporting semantics (report and clear), which is also a design choice made to avoid having higher-resolution error tracking and the corresponding memory overheads [1].</p> <p>[1] <a href="https://lwn.net/Articles/718734/">https://lwn.net/Articles/718734/</a></p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-02 23:23:24 </code></pre> <p>On 2018-04-03 01:05:44 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>Would using sync_file_range() be helpful? Potential errors would only apply to pages that cover the requested file ranges. There are a few caveats though:</p> </blockquote> <p>To quote sync_file_range(2):</p> <pre><code> Warning This system call is extremely dangerous and should not be used in portable programs. None of these operations writes out the file's metadata. Therefore, unless the application is strictly performing overwrites of already-instantiated disk blocks, there are no guarantees that the data will be available after a crash. There is no user interface to know if a write is purely an over‐ write. On filesystems using copy-on-write semantics (e.g., btrfs) an overwrite of existing allocated blocks is impossible. When writing into preallocated space, many filesystems also require calls into the block allocator, which this system call does not sync out to disk. This system call does not flush disk write caches and thus does not provide any data integrity on systems with volatile disk write caches. 
</code></pre> <p>Given the lack of metadata safety that seems entirely a no go. We use sfr(2), but only to force the kernel's hand around writing back earlier without throwing away cache contents.</p> <blockquote> <blockquote> <blockquote> <p>The application will need to deal with that first error irrespective of subsequent return codes from fsync(). Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.</p> </blockquote> <p>We do deal with that error- by realizing that it failed and later <em>retrying</em> the fsync(), which is when we get back an &quot;all good! everything with this file descriptor you've opened is sync'd!&quot; and happily expect that to be truth, when, in reality, it's an unfortunate lie and there are still pages associated with that file descriptor which are, in reality, dirty and not sync'd to disk.</p> </blockquote> <p>It really turns out that this is not how the fsync() semantics work though</p> </blockquote> <p>Except on FreeBSD and Solaris, and perhaps others.</p> <blockquote> <p>, exactly because the nature of the errors: even if the kernel retained the dirty bits on the failed pages, retrying persisting them on the same disk location would simply fail.</p> </blockquote> <p>That's not guaranteed at all, think NFS.</p> <blockquote> <p>Instead the kernel opts for marking those pages clean (since there is no other recovery strategy), and reporting once to the caller who can potentially deal with it in some manner. It is sadly a bad and undocumented convention.</p> </blockquote> <p>It's broken behaviour justified post facto with the only rationale that was available, which explains why it's so unconvincing. You could just say &quot;this ship has sailed, and it's too onerous to change because xxx&quot; and this'd be a done deal. 
But claiming this is reasonable behaviour is ridiculous.</p> <p>Again, you could just continue to error for this fd and still throw away the data.</p> <blockquote> <blockquote> <p>Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?</p> </blockquote> <p>I think what you have in mind are the semantics of sync() rather than fsync()</p> </blockquote> <p>If you open the same file with two fds, and write with one, and fsync with another that's definitely supposed to work. And sync() isn't a realistic replacement in any sort of way because it's obviously systemwide, and thus entirely and completely unsuitable. Nor does it have any sort of better error reporting behaviour, does it?</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-02 23:27:35 </code></pre> <p>On 3 April 2018 at 07:05, Anthony Iliopoulos wrote:</p> <blockquote> <p>Hi Stephen,</p> <p>On Mon, Apr 02, 2018 at 04:58:08PM -0400, Stephen Frost wrote:</p> <blockquote> <p>fsync() doesn't reflect the status of given pages, however, it reflects the status of the file descriptor, and as such the file, on which it's called. This notion that fsync() is actually only responsible for the changes which were made to a file since the last fsync() call is pure foolishness. 
If we were able to pass a list of pages or data ranges to fsync() for it to verify they're on disk then perhaps things would be different, but we can't, all we can do is ask to &quot;please flush all the dirty pages associated with this file descriptor, which represents this file we opened, to disk, and let us know if you were successful.&quot;</p> <p>Give us a way to ask &quot;are these specific pages written out to persistent storage?&quot; and we would certainly be happy to use it, and to repeatedly try to flush out pages which weren't synced to disk due to some transient error, and to track those cases and make sure that we don't incorrectly assume that they've been transferred to persistent storage.</p> </blockquote> <p>Indeed fsync() is simply a rather blunt instrument and a narrow legacy interface but further changing its established semantics (no matter how unreasonable they may be) is probably not the way to go.</p> </blockquote> <p>They're undocumented and extremely surprising semantics that are arguably a violation of the POSIX spec for fsync(), or at least a surprising interpretation of it.</p> <p>So I don't buy this argument.</p> <blockquote> <p>It really turns out that this is not how the fsync() semantics work though, exactly because the nature of the errors: even if the kernel retained the dirty bits on the failed pages, retrying persisting them on the same disk location would simply fail.</p> </blockquote> <p><em>might</em> simply fail.</p> <p>It depends on why the error occurred.</p> <p>I originally identified this behaviour on a multipath system. Multipath defaults to &quot;throw the writes away, nobody really cares anyway&quot; on error. It seems to figure a higher level will retry, or the application will receive the error and retry.</p> <p>(See no_path_retry in multipath config. 
AFAICS the default is insanely dangerous and only suitable for specialist apps that understand the quirks; you should use no_path_retry=queue).</p> <blockquote> <p>Instead the kernel opts for marking those pages clean (since there is no other recovery strategy),</p> <p>and reporting once to the caller who can potentially deal with it in some manner. It is sadly a bad and undocumented convention.</p> </blockquote> <p>It could mark the FD.</p> <p>It's not just undocumented, it's a slightly creative interpretation of the POSIX spec for fsync.</p> <blockquote> <blockquote> <p>Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?</p> </blockquote> <p>I think what you have in mind are the semantics of sync() rather than fsync(), but as long as an application needs to ensure data are persisted to storage, it needs to retain those data in its heap until fsync() is successful instead of discarding them and relying on the kernel after write().</p> </blockquote> <p>This is almost exactly what we tell application authors using PostgreSQL: the data isn't written until you receive a successful commit confirmation, so you'd better not forget it.</p> <p>We provide applications with <em>clear boundaries</em> so they can know <em>exactly</em> what was, and was not, written. 
I guess the argument from the kernel is the same is true: whatever was written since the last <em>successful</em> fsync is potentially lost and must be redone.</p> <p>But the fsync behaviour is utterly undocumented and dubiously standard.</p> <blockquote> <p>I think we are conflating a few issues here: having the OS kernel being responsible for error recovery (so that subsequent fsync() would fix the problems) is one. This clearly is a design which most kernels have not really adopted for reasons outlined above</p> </blockquote> <p>[citation needed]</p> <p>What do other major platforms do here? The post above suggests it's a bit of a mix of behaviours.</p> <blockquote> <p>Now, there is the issue of granularity of error reporting: userspace could benefit from a fine-grained indication of failed pages (or file ranges).</p> </blockquote> <p>Yep. I looked at AIO in the hopes that, if we used AIO, we'd be able to map a sync failure back to an individual AIO write.</p> <p>But it seems AIO just adds more problems and fixes none. Flush behaviour with AIO from what I can tell is inconsistent version to version and generally unhelpful. The kernel should really report such sync failures back to the app on its AIO write mapping, but it seems nothing of the sort happens.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-03 00:03:39 </code></pre> <blockquote> <p>On Apr 2, 2018, at 16:27, Craig Ringer wrote:</p> <p>They're undocumented and extremely surprising semantics that are arguably a violation of the POSIX spec for fsync(), or at least a surprising interpretation of it.</p> </blockquote> <p>Even accepting that (I personally go with surprising over violation, as if my vote counted), it is highly unlikely that we will convince every kernel team to declare &quot;What fools we've been!&quot; and push a change... and even if they did, PostgreSQL can look forward to many years of running on kernels with the broken semantics. 
Given that, I think the PANIC option is the soundest one, as unappetizing as it is.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-03 00:05:09 </code></pre> <p>On April 2, 2018 5:03:39 PM PDT, Christophe Pettus wrote:</p> <blockquote> <blockquote> <p>On Apr 2, 2018, at 16:27, Craig Ringer wrote:</p> <p>They're undocumented and extremely surprising semantics that are arguably a violation of the POSIX spec for fsync(), or at least a surprising interpretation of it.</p> </blockquote> <p>Even accepting that (I personally go with surprising over violation, as if my vote counted), it is highly unlikely that we will convince every kernel team to declare &quot;What fools we've been!&quot; and push a change... and even if they did, PostgreSQL can look forward to many years of running on kernels with the broken semantics. Given that, I think the PANIC option is the soundest one, as unappetizing as it is.</p> </blockquote> <p>Don't we pretty much already have agreement in that? And Craig is the main proponent of it?</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-03 00:07:41 </code></pre> <blockquote> <p>On Apr 2, 2018, at 17:05, Andres Freund wrote:</p> <p>Don't we pretty much already have agreement in that? And Craig is the main proponent of it?</p> </blockquote> <p>For sure on the second sentence; the first was not clear to me.</p> <hr /> <pre><code>From:Peter Geoghegan &lt;pg(at)bowt(dot)ie&gt; Date:2018-04-03 00:48:00 </code></pre> <p>On Mon, Apr 2, 2018 at 5:05 PM, Andres Freund wrote:</p> <blockquote> <blockquote> <p>Even accepting that (I personally go with surprising over violation, as if my vote counted), it is highly unlikely that we will convince every kernel team to declare &quot;What fools we've been!&quot; and push a change... and even if they did, PostgreSQL can look forward to many years of running on kernels with the broken semantics. 
Given that, I think the PANIC option is the soundest one, as unappetizing as it is.</p> </blockquote> <p>Don't we pretty much already have agreement in that? And Craig is the main proponent of it?</p> </blockquote> <p>I wonder how bad it will be in practice if we PANIC. Craig said &quot;This isn't as bad as it seems because AFAICS fsync only returns EIO in cases where we should be stopping the world anyway, and many FSes will do that for us&quot;. It would be nice to get more information on that.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-03 01:29:28 </code></pre> <p>On Tue, Apr 3, 2018 at 3:03 AM, Craig Ringer wrote:</p> <blockquote> <p>I see little benefit to not just PANICing unconditionally on EIO, really. It shouldn't happen, and if it does, we want to be pretty conservative and adopt a data-protective approach.</p> <p>I'm rather more worried by doing it on ENOSPC. Which looks like it might be necessary from what I recall finding in my test case + kernel code reading. I really don't want to respond to a possibly-transient ENOSPC by PANICing the whole server unnecessarily.</p> </blockquote> <p>Yeah, it'd be nice to give an administrator the chance to free up some disk space after ENOSPC is reported, and stay up. Running out of space really shouldn't take down the database without warning! The question is whether the data remains in cache and marked dirty, so that retrying is a safe option (since it's potentially gone from our own buffers, so if the OS doesn't have it the only place your committed data can definitely still be found is the WAL... recovery time). Who can tell us? Do we need a per-filesystem answer? Delayed allocation is a somewhat filesystem-specific thing, so maybe. 
Interestingly, there don't seem to be many operating systems that can report ENOSPC from fsync(), based on a quick scan through some documentation:</p> <pre><code>POSIX, AIX, HP-UX, FreeBSD, OpenBSD, NetBSD: no
Illumos/Solaris, Linux, macOS: yes
</code></pre> <p>I don't know if macOS really means it or not; it just tells you to see the errors for read(2) and write(2). By the way, speaking of macOS, I was curious to see if the common BSD heritage would show here. Yeah, somewhat. It doesn't appear to keep buffers on writeback error, if this is the right code [1] (though it could be handling it somewhere else for all I know).</p> <p>[1] <a href="https://github.com/apple/darwin-xnu/blob/master/bsd/vfs/vfs_bio.c#L2695">https://github.com/apple/darwin-xnu/blob/master/bsd/vfs/vfs_bio.c#L2695</a></p> <hr /> <pre><code>From:Robert Haas &lt;robertmhaas(at)gmail(dot)com&gt; Date:2018-04-03 02:54:26 </code></pre> <p>On Mon, Apr 2, 2018 at 2:53 PM, Anthony Iliopoulos wrote:</p> <blockquote> <p>Given precisely that the dirty pages which cannot been written-out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).</p> </blockquote> <p>Like other people here, I think this is 100% unreasonable, starting with &quot;the dirty pages which cannot been written out are practically thrown away&quot;. Who decided that was OK, and on the basis of what wording in what specification? I think it's always unreasonable to throw away the user's data. If the writes are going to fail, then let them keep on failing every time. 
<em>That</em> wouldn't cause any data loss, because we'd never be able to checkpoint, and eventually the user would have to kill the server uncleanly, and that would trigger recovery.</p> <p>Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.</p> <p>Even leaving that aside, a PANIC means a prolonged outage on a prolonged system - it could easily take tens of minutes or longer to run recovery. So saying &quot;oh, just do that&quot; is not really an answer. Sure, we can do it, but it's like trying to lose weight by intentionally eating a tapeworm. Now, it's possible to shorten the checkpoint_timeout so that recovery runs faster, but then performance drops because data has to be fsync()'d more often instead of getting buffered in the OS cache for the maximum possible time. We could also dodge this issue in another way: suppose that when we write a page out, we don't consider it really written until fsync() succeeds. Then we wouldn't need to PANIC if an fsync() fails; we could just re-write the page. Unfortunately, this would also be terrible for performance, for pretty much the same reasons: letting the OS cache absorb lots of dirty blocks and do write-combining is <em>necessary</em> for good performance.</p> <blockquote> <p>The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. 
Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.</p> </blockquote> <p>I might accept this argument if I accepted that it was OK to decide that an fsync() failure means you can forget that the write() ever happened in the first place, but it's hard to imagine an application that wants that behavior. If the application didn't care about whether the bytes really got to disk or not, it would not have called fsync() in the first place. If it does care, reporting the error only once is never an improvement.</p> <hr /> <pre><code>From:Peter Geoghegan &lt;pg(at)bowt(dot)ie&gt; Date:2018-04-03 03:45:30 </code></pre> <p>On Mon, Apr 2, 2018 at 7:54 PM, Robert Haas wrote:</p> <blockquote> <p>Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.</p> </blockquote> <p>I fear that the conventional wisdom from the Kernel people is now &quot;you should be using O_DIRECT for granular control&quot;. The LWN article Thomas linked (<a href="https://lwn.net/Articles/718734">https://lwn.net/Articles/718734</a>) cites Ted Ts'o:</p> <p>&quot;Monakhov asked why a counter was needed; Layton said it was to handle multiple overlapping writebacks. Effectively, the counter would record whether a writeback had failed since the file was opened or since the last fsync(). Ts'o said that should be fine; applications that want more information should use O_DIRECT. 
For most applications, knowledge that an error occurred somewhere in the file is all that is necessary; applications that require better granularity already use O_DIRECT.&quot;</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-03 10:35:39 </code></pre> <p>Hi Robert,</p> <p>On Mon, Apr 02, 2018 at 10:54:26PM -0400, Robert Haas wrote:</p> <blockquote> <p>On Mon, Apr 2, 2018 at 2:53 PM, Anthony Iliopoulos wrote:</p> <blockquote> <p>Given precisely that the dirty pages which cannot been written-out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).</p> </blockquote> <p>Like other people here, I think this is 100% unreasonable, starting with &quot;the dirty pages which cannot been written out are practically thrown away&quot;. Who decided that was OK, and on the basis of what wording in what specification? I think it's always unreasonable to</p> </blockquote> <p>If you insist on strict conformance to POSIX, indeed the linux glibc configuration and associated manpage are probably wrong in stating that _POSIX_SYNCHRONIZED_IO is supported. The implementation matches that of the flexibility allowed by not supporting SIO. There's a long history of brokenness between linux and posix, and I think there was never an intention of conforming to the standard.</p> <blockquote> <p>throw away the user's data. If the writes are going to fail, then let them keep on failing every time. 
<em>That</em> wouldn't cause any data loss, because we'd never be able to checkpoint, and eventually the user would have to kill the server uncleanly, and that would trigger recovery.</p> </blockquote> <p>I believe (as tried to explain earlier) there is a certain assumption being made that the writer and original owner of data is responsible for dealing with potential errors in order to avoid data loss (which should be only of interest to the original writer anyway). It would be very questionable for the interface to persist the error while subsequent writes and fsyncs to different offsets may as well go through. Another process may need to write into the file and fsync, while being unaware of those newly introduced semantics is now faced with EIO because some unrelated previous process failed some earlier writes and did not bother to clear the error for those writes. In a similar scenario where the second process is aware of the new semantics, it would naturally go ahead and clear the global error in order to proceed with its own write()+fsync(), which would essentially amount to the same problematic semantics you have now.</p> <blockquote> <p>Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.</p> </blockquote> <p>Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that <em>happen to be open at the time of error</em>. 
But I think one would have a hard time defending a modification to the kernel where this is further extended to cover cases where:</p> <p>process A does write() on some file offset which fails writeback, fsync() gets EIO and exit()s.</p> <p>process B does write() on some other offset which succeeds writeback, but fsync() gets EIO due to (uncleared) failures of earlier process.</p> <p>This would be a highly user-visible change of semantics from edge-triggered to level-triggered behavior.</p> <blockquote> <p>dodge this issue in another way: suppose that when we write a page out, we don't consider it really written until fsync() succeeds. Then</p> </blockquote> <p>That's the only way to think about fsync() guarantees unless you are on a kernel that keeps retrying to persist dirty pages. Assuming such a model, after repeated and unrecoverable hard failures the process would have to explicitly inform the kernel to drop the dirty pages. All the process could do at that point is read back to userspace the dirty/failed pages and attempt to rewrite them at a different place (which is currently possible too). Most applications would not bother though to inform the kernel and drop the permanently failed pages; and thus someone eventually would hit the case that a large amount of failed writeback pages are running his server out of memory, at which point people will complain that those semantics are completely unreasonable.</p> <blockquote> <p>we wouldn't need to PANIC if an fsync() fails; we could just re-write the page. Unfortunately, this would also be terrible for performance, for pretty much the same reasons: letting the OS cache absorb lots of dirty blocks and do write-combining is <em>necessary</em> for good performance.</p> </blockquote> <p>Not sure I understand this case. The application may indeed re-write a bunch of pages that have failed and proceed with fsync(). The kernel will deal with combining the writeback of all the re-written pages. 
But further the necessity of combining for performance really depends on the exact storage medium. At the point you start caring about write-combining, the kernel community will naturally redirect you to use DIRECT_IO.</p> <blockquote> <blockquote> <p>The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.</p> </blockquote> <p>I might accept this argument if I accepted that it was OK to decide that an fsync() failure means you can forget that the write() ever happened in the first place, but it's hard to imagine an application that wants that behavior. If the application didn't care about whether the bytes really got to disk or not, it would not have called fsync() in the first place. If it does care, reporting the error only once is never an improvement.</p> </blockquote> <p>Again, conflating two separate issues, that of buffering and retrying failed pages and that of error reporting. 
Yes it would be convenient for applications not to have to care at all about recovery of failed write-backs, but at some point they would have to face this issue one way or another (I am assuming we are always talking about hard failures, other kinds of failures are probably already being dealt with transparently at the kernel level).</p> <p>As for the reporting, it is also unreasonable to effectively signal and persist an error on a file-wide granularity while it pertains to subsets of that file and other writes can go through, but I am repeating myself.</p> <p>I suppose that if the check-and-clear semantics are problematic for Pg, one could suggest a kernel patch that opts-in to a level-triggered reporting of fsync() on a per-descriptor basis, which seems to be non-intrusive and probably sufficient to cover your expected use-case.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-03 11:26:05 </code></pre> <p>On 3 April 2018 at 11:35, Anthony Iliopoulos wrote:</p> <blockquote> <p>Hi Robert,</p> <p>Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that <em>happen to be open at the time of error</em>. But I think one would have a hard time defending a modification to the kernel where this is further extended to cover cases where:</p> <p>process A does write() on some file offset which fails writeback, fsync() gets EIO and exit()s.</p> <p>process B does write() on some other offset which succeeds writeback, but fsync() gets EIO due to (uncleared) failures of earlier process.</p> </blockquote> <p>Surely that's exactly what process B would want? If it calls fsync and gets a success and later finds out that the file is corrupt and didn't match what was in memory it's not going to be happy.</p> <p>This seems like an attempt to co-opt fsync for a new and different purpose for which it's poorly designed. It's not an async error reporting mechanism for writes. 
It would be useless as that as any process could come along and open your file and eat the errors for writes you performed. An async error reporting mechanism would have to document which writes it was giving errors for and give you ways to control that.</p> <p>The semantics described here are useless for everyone. For a program needing to know the error status of the writes it executed, it doesn't know which writes are included in which fsync call. For a program using fsync for its original intended purpose of guaranteeing that all the writes are synced to disk, it no longer has any guarantee at all.</p> <blockquote> <p>This would be a highly user-visible change of semantics from edge-triggered to level-triggered behavior.</p> </blockquote> <p>It was always documented as level-triggered. This edge-triggered concept is a complete surprise to application writers.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-03 13:36:47 </code></pre> <p>On Tue, Apr 03, 2018 at 12:26:05PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 3 April 2018 at 11:35, Anthony Iliopoulos wrote:</p> <blockquote> <p>Hi Robert,</p> <p>Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that <em>happen to be open at the time of error</em>. But I think one would have a hard time defending a modification to the kernel where this is further extended to cover cases where:</p> <p>process A does write() on some file offset which fails writeback, fsync() gets EIO and exit()s.</p> <p>process B does write() on some other offset which succeeds writeback, but fsync() gets EIO due to (uncleared) failures of earlier process.</p> </blockquote> <p>Surely that's exactly what process B would want? 
If it calls fsync and gets a success and later finds out that the file is corrupt and didn't match what was in memory it's not going to be happy.</p> </blockquote> <p>You can't possibly make this assumption. Process B may be reading and writing to completely disjoint regions from those of process A, and as such not really caring about earlier failures, only wanting to ensure its own writes go all the way through. But even if it did care, the file interfaces make no transactional guarantees. Even without fsync() there is nothing preventing process B from reading dirty pages from process A, and based on their content proceed to its own business and write/persist new data, while process A further modifies the not-yet-flushed pages in-memory before flushing. In this case you'd need explicit synchronization/locking between the processes anyway, so why would fsync() be an exception?</p> <blockquote> <p>This seems like an attempt to co-opt fsync for a new and different purpose for which it's poorly designed. It's not an async error reporting mechanism for writes. It would be useless as that as any process could come along and open your file and eat the errors for writes you performed. An async error reporting mechanism would have to document which writes it was giving errors for and give you ways to control that.</p> </blockquote> <p>The errseq_t fixes deal with that; errors will be reported to any process that has an open fd, irrespective of who is the actual caller of the fsync() that may have induced errors. This is anyway required as the kernel may evict dirty pages on its own by doing writeback and as such there needs to be a way to report errors on all open fds.</p> <blockquote> <p>The semantics described here are useless for everyone. For a program needing to know the error status of the writes it executed, it doesn't know which writes are included in which fsync call. 
For a program</p> </blockquote> <p>If EIO persists between invocations until explicitly cleared, a process cannot possibly make any decision as to if it should clear the error and proceed or some other process will need to leverage that without coordination, or which writes actually failed for that matter. We would be back to the case of requiring explicit synchronization between processes that care about this, in which case the processes may as well synchronize over calling fsync() in the first place.</p> <p>Having an opt-in persisting EIO per-fd would practically be a form of &quot;contract&quot; between &quot;cooperating&quot; processes anyway.</p> <p>But instead of deconstructing and debating the semantics of the current mechanism, why not come up with the ideal desired form of error reporting/tracking granularity etc., and see how this may be fitted into kernels as a new interface.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-03 14:29:10 </code></pre> <p>On 3 April 2018 at 10:54, Robert Haas wrote:</p> <blockquote> <p>I think it's always unreasonable to throw away the user's data.</p> </blockquote> <p>Well, we do that. If a txn aborts, all writes in the txn are discarded.</p> <p>I think that's perfectly reasonable. Though we also promise an all or nothing effect, we make exceptions even there.</p> <p>The FS doesn't offer transactional semantics, but the fsync behaviour can be interpreted kind of similarly.</p> <p>I don't <em>agree</em> with it, but I don't think it's as wholly unreasonable as all that. I think leaving it undocumented is absolutely gobsmacking, and it's dubious at best, but it's not totally insane.</p> <blockquote> <p>If the writes are going to fail, then let them keep on failing every time.</p> </blockquote> <p>Like we do, where we require an explicit rollback.</p> <p>But POSIX may pose issues there, it doesn't really define any interface for that AFAIK. 
Unless you expect the app to close() and re-open() the file. Replacing one nonstandard issue with another may not be a win.</p> <blockquote> <p><em>That</em> wouldn't cause any data loss, because we'd never be able to checkpoint, and eventually the user would have to kill the server uncleanly, and that would trigger recovery.</p> </blockquote> <p>Yep. That's what I expected to happen on unrecoverable I/O errors. Because, y'know, unrecoverable.</p> <p>I was stunned to learn it's not so. And I'm even more amazed to learn that ext4's errors=remount-ro apparently doesn't concern its self with mere user data, and may exhibit the same behaviour - I need to rerun my test case on it tomorrow.</p> <blockquote> <p>Also, this really does make it impossible to write reliable programs.</p> </blockquote> <p>In the presence of multiple apps interacting on the same file, yes. I think that's a little bit of a stretch though.</p> <p>For a single app, you can recover by remembering and redoing all the writes you did.</p> <p>Sucks if your app wants to have multiple processes working together on a file without some kind of journal or WAL, relying on fsync() alone, mind you. But at least we have WAL.</p> <p>Hrm. I wonder how this interacts with wal_level=minimal.</p> <blockquote> <p>Even leaving that aside, a PANIC means a prolonged outage on a prolonged system - it could easily take tens of minutes or longer to run recovery. So saying &quot;oh, just do that&quot; is not really an answer. Sure, we can do it, but it's like trying to lose weight by intentionally eating a tapeworm. Now, it's possible to shorten the checkpoint_timeout so that recovery runs faster, but then performance drops because data has to be fsync()'d more often instead of getting buffered in the OS cache for the maximum possible time.</p> </blockquote> <p>It's also spikier. 
Users have more issues with latency with short, frequent checkpoints.</p> <blockquote> <p>We could also dodge this issue in another way: suppose that when we write a page out, we don't consider it really written until fsync() succeeds. Then we wouldn't need to PANIC if an fsync() fails; we could just re-write the page. Unfortunately, this would also be terrible for performance, for pretty much the same reasons: letting the OS cache absorb lots of dirty blocks and do write-combining is <em>necessary</em> for good performance.</p> </blockquote> <p>Our double-caching is already plenty bad enough anyway, as well.</p> <p>(Ideally I want to be able to swap buffers between shared_buffers and the OS buffer-cache. Almost like a 2nd level of buffer pinning. When we write out a block, we <em>transfer</em> ownership to the OS. Yeah, I'm dreaming. But we'd sure need to be able to trust the OS not to just forget the block then!)</p> <blockquote> <blockquote> <p>The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.</p> </blockquote> <p>I might accept this argument if I accepted that it was OK to decide that an fsync() failure means you can forget that the write() ever happened in the first place, but it's hard to imagine an application that wants that behavior. If the application didn't care about whether the bytes really got to disk or not, it would not have called fsync() in the first place. If it does care, reporting the error only once is never an improvement.</p> </blockquote> <p>Many RDBMSes do just that. It's hardly behaviour unique to the kernel. 
They report an ERROR on a statement in a txn then go on with life, merrily forgetting that anything was ever wrong.</p> <p>I agree with PostgreSQL's stance that this is wrong. We require an explicit rollback (or ROLLBACK TO SAVEPOINT) to restore the session to a usable state. This is good.</p> <p>But we're the odd one out there. Almost everyone else does much like what fsync() does on Linux, report the error and forget it.</p> <p>In any case, we're not going to get anyone to backpatch a fix for this into all kernels, so we're stuck working around it.</p> <p>I'll do some testing with ENOSPC tomorrow, propose a patch, report back.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-03 14:37:30 </code></pre> <p>On 3 April 2018 at 14:36, Anthony Iliopoulos wrote:</p> <blockquote> <p>If EIO persists between invocations until explicitly cleared, a process cannot possibly make any decision as to if it should clear the error</p> </blockquote> <p>I still don't understand what &quot;clear the error&quot; means here. The writes still haven't been written out. We don't care about tracking errors, we just care whether all the writes to the file have been flushed to disk. By &quot;clear the error&quot; you mean throw away the dirty pages and revert part of the file to some old data? Why would anyone ever want that?</p> <blockquote> <p>But instead of deconstructing and debating the semantics of the current mechanism, why not come up with the ideal desired form of error reporting/tracking granularity etc., and see how this may be fitted into kernels as a new interface.</p> </blockquote> <p>Because Postgres is portable software that won't be able to use some Linux-specific interface. And doesn't really need any granular error reporting system anyways. 
It just needs to know when all writes have been synced to disk.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-03 16:52:07 </code></pre> <p>On Tue, Apr 03, 2018 at 03:37:30PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 3 April 2018 at 14:36, Anthony Iliopoulos wrote:</p> <blockquote> <p>If EIO persists between invocations until explicitly cleared, a process cannot possibly make any decision as to if it should clear the error</p> </blockquote> <p>I still don't understand what &quot;clear the error&quot; means here. The writes still haven't been written out. We don't care about tracking errors, we just care whether all the writes to the file have been flushed to disk. By &quot;clear the error&quot; you mean throw away the dirty pages and revert part of the file to some old data? Why would anyone ever want that?</p> </blockquote> <p>It means that the responsibility of recovering the data is passed back to the application. The writes may never be able to be written out. How would a kernel deal with that? Either discard the data (and have the writer acknowledge) or buffer the data until reboot and simply risk going OOM. It's not what someone would want, but rather <em>need</em> to deal with, one way or the other. At least on the application-level there's a fighting chance for restoring to a consistent state. The kernel does not have that opportunity.</p> <blockquote> <blockquote> <p>But instead of deconstructing and debating the semantics of the current mechanism, why not come up with the ideal desired form of error reporting/tracking granularity etc., and see how this may be fitted into kernels as a new interface.</p> </blockquote> <p>Because Postgres is portable software that won't be able to use some Linux-specific interface. And doesn't really need any granular error</p> </blockquote> <p>I don't really follow this argument, Pg is admittedly using non-portable interfaces (e.g the sync_file_range()). 
While it's nice to avoid platform specific hacks, expecting that the POSIX semantics will be consistent across systems is simply a 90's pipe dream. While it would be lovely to have really consistent interfaces for application writers, this is simply not going to happen any time soon.</p> <p>And since those problematic semantics of fsync() appear to be prevalent in other systems as well that are not likely to be changed, you cannot rely on preconception that once buffers are handed over to kernel you have a guarantee that they will be eventually persisted no matter what. (Why even bother having fsync() in that case? The kernel would eventually evict and writeback dirty pages anyway. The point of reporting the error back to the application is to give it a chance to recover - the kernel could repeat &quot;fsync()&quot; itself internally if this would solve anything).</p> <blockquote> <p>reporting system anyways. It just needs to know when all writes have been synced to disk.</p> </blockquote> <p>Well, it does know when <em>some</em> writes have <em>not</em> been synced to disk, exactly because the responsibility is passed back to the application. I do realize this puts more burden back to the application, but what would a viable alternative be? Would you rather have a kernel that risks periodically going OOM due to this design decision?</p> <hr /> <pre><code>From:Robert Haas &lt;robertmhaas(at)gmail(dot)com&gt; Date:2018-04-03 21:47:01 </code></pre> <p>On Tue, Apr 3, 2018 at 6:35 AM, Anthony Iliopoulos wrote:</p> <blockquote> <blockquote> <p>Like other people here, I think this is 100% unreasonable, starting with &quot;the dirty pages which cannot been written out are practically thrown away&quot;. Who decided that was OK, and on the basis of what wording in what specification? 
I think it's always unreasonable to</p> </blockquote> <p>If you insist on strict conformance to POSIX, indeed the linux glibc configuration and associated manpage are probably wrong in stating that _POSIX_SYNCHRONIZED_IO is supported. The implementation matches that of the flexibility allowed by not supporting SIO. There's a long history of brokenness between linux and posix, and I think there was never an intention of conforming to the standard.</p> </blockquote> <p>Well, then the man page probably shouldn't say CONFORMING TO 4.3BSD, POSIX.1-2001, which on the first system I tested, it did. Also, the summary should be changed from the current &quot;fsync, fdatasync - synchronize a file's in-core state with storage device&quot; by adding &quot;, possibly by randomly undoing some of the changes you think you made to the file&quot;.</p> <blockquote> <p>I believe (as tried to explain earlier) there is a certain assumption being made that the writer and original owner of data is responsible for dealing with potential errors in order to avoid data loss (which should be only of interest to the original writer anyway). It would be very questionable for the interface to persist the error while subsequent writes and fsyncs to different offsets may as well go through.</p> </blockquote> <p>No, that's not questionable at all. fsync() doesn't take any argument saying which part of the file you care about, so the kernel is entirely not entitled to assume it knows to which writes a given fsync() call was intended to apply.</p> <blockquote> <p>Another process may need to write into the file and fsync, while being unaware of those newly introduced semantics is now faced with EIO because some unrelated previous process failed some earlier writes and did not bother to clear the error for those writes. 
In a similar scenario where the second process is aware of the new semantics, it would naturally go ahead and clear the global error in order to proceed with its own write()+fsync(), which would essentially amount to the same problematic semantics you have now.</p> </blockquote> <p>I don't deny that it's possible that somebody could have an application which is utterly indifferent to the fact that earlier modifications to a file failed due to I/O errors, but is A-OK with that as long as later modifications can be flushed to disk, but I don't think that's a normal thing to want.</p> <blockquote> <blockquote> <p>Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.</p> </blockquote> <p>Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that <em>happen to be open at the time of error</em>.</p> </blockquote> <p>Well, in PostgreSQL, we have a background process called the checkpointer which is the process that normally does all of the fsync() calls but only a subset of the write() calls. The checkpointer does not, however, necessarily have every file open all the time, so these fixes aren't sufficient to make sure that the checkpointer ever sees an fsync() failure. What you have (or someone has) basically done here is made an undocumented assumption about which file descriptors might care about a particular error, but it just so happens that PostgreSQL has never conformed to that assumption. 
You can keep on saying the problem is with our assumptions, but it doesn't seem like a very good guess to me to suppose that we're the only program that has ever made them. The documentation for fsync() gives zero indication that it's edge-triggered, and so complaining that people wouldn't like it if it became level-triggered seems like an ex post facto justification for a poorly-chosen behavior: they probably think (as we did prior to a week ago) that it already is.</p> <blockquote> <p>Not sure I understand this case. The application may indeed re-write a bunch of pages that have failed and proceed with fsync(). The kernel will deal with combining the writeback of all the re-written pages. But further the necessity of combining for performance really depends on the exact storage medium. At the point you start caring about write-combining, the kernel community will naturally redirect you to use DIRECT_IO.</p> </blockquote> <p>Well, the way PostgreSQL works today, we typically run with say 8GB of shared_buffers even if the system memory is, say, 200GB. As pages are evicted from our relatively small cache to the operating system, we track which files need to be fsync()'d at checkpoint time, but we don't hold onto the blocks. Until checkpoint time, the operating system is left to decide whether it's better to keep caching the dirty blocks (thus leaving less memory for other things, but possibly allowing write-combining if the blocks are written again) or whether it should clean them to make room for other things. 
This means that only a small portion of the operating system memory is directly managed by PostgreSQL, while allowing the effective size of our cache to balloon to some very large number if the system isn't under heavy memory pressure.</p> <p>Now, I hear the DIRECT_IO thing and I assume we're eventually going to have to go that way: Linux kernel developers seem to think that &quot;real men use O_DIRECT&quot; and so if other forms of I/O don't provide useful guarantees, well that's our fault for not using O_DIRECT. That's a political reason, not a technical reason, but it's a reason all the same.</p> <p>Unfortunately, that is going to add a huge amount of complexity, because if we ran with shared_buffers set to a large percentage of system memory, we couldn't allocate large chunks of memory for sorts and hash tables from the operating system any more. We'd have to allocate it from our own shared_buffers because that's basically all the memory there is and using substantially more might run the system out entirely. So it's a huge, huge architectural change. 
And even once it's done it is in some ways inferior to what we are doing today -- true, it gives us superior control over writeback timing, but it also makes PostgreSQL play less nicely with other things running on the same machine, because now PostgreSQL has a dedicated chunk of whatever size it has, rather than using some portion of the OS buffer cache that can grow and shrink according to memory needs both of other parts of PostgreSQL and other applications on the system.</p> <blockquote> <p>I suppose that if the check-and-clear semantics are problematic for Pg, one could suggest a kernel patch that opts-in to a level-triggered reporting of fsync() on a per-descriptor basis, which seems to be non-intrusive and probably sufficient to cover your expected use-case.</p> </blockquote> <p>That would certainly be better than nothing.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-03 23:59:27 </code></pre> <p>On Tue, Apr 3, 2018 at 1:29 PM, Thomas Munro wrote:</p> <blockquote> <p>Interestingly, there don't seem to be many operating systems that can report ENOSPC from fsync(), based on a quick scan through some documentation:</p> <p>POSIX, AIX, HP-UX, FreeBSD, OpenBSD, NetBSD: no Illumos/Solaris, Linux, macOS: yes</p> </blockquote> <p>Oops, reading comprehension fail. POSIX yes (since issue 5), via the note that read() and write()'s error conditions can also be returned.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 00:56:37 </code></pre> <p>On Tue, Apr 3, 2018 at 05:47:01PM -0400, Robert Haas wrote:</p> <blockquote> <p>Well, in PostgreSQL, we have a background process called the checkpointer which is the process that normally does all of the fsync() calls but only a subset of the write() calls. 
The checkpointer does not, however, necessarily have every file open all the time, so these fixes aren't sufficient to make sure that the checkpointer ever sees an fsync() failure.</p> </blockquote> <p>There has been a lot of focus in this thread on the workflow:</p> <pre><code>write() -&gt; blocks remain in kernel memory -&gt; fsync() -&gt; panic? </code></pre> <p>But what happens in this workflow:</p> <pre><code>write() -&gt; kernel syncs blocks to storage -&gt; fsync() </code></pre> <p>Is fsync() going to see a &quot;kernel syncs blocks to storage&quot; failure?</p> <p>There was already discussion that if the fsync() causes the &quot;syncs blocks to storage&quot;, fsync() will only report the failure once, but will it see any failure in the second workflow? There is indication that a failed write to storage reports back an error once and clears the dirty flag, but do we know it keeps things around long enough to report an error to a future fsync()?</p> <p>You would think it does, but I have to ask since our fsync() assumptions have been wrong for so long.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-04 01:54:50 </code></pre> <p>On Wed, Apr 4, 2018 at 12:56 PM, Bruce Momjian wrote:</p> <blockquote> <p>There has been a lot of focus in this thread on the workflow:</p> <pre><code> write() -&gt; blocks remain in kernel memory -&gt; fsync() -&gt; panic? </code></pre> <p>But what happens in this workflow:</p> <pre><code> write() -&gt; kernel syncs blocks to storage -&gt; fsync() </code></pre> <p>Is fsync() going to see a &quot;kernel syncs blocks to storage&quot; failure?</p> <p>There was already discussion that if the fsync() causes the &quot;syncs blocks to storage&quot;, fsync() will only report the failure once, but will it see any failure in the second workflow? 
There is indication that a failed write to storage reports back an error once and clears the dirty flag, but do we know it keeps things around long enough to report an error to a future fsync()?</p> <p>You would think it does, but I have to ask since our fsync() assumptions have been wrong for so long.</p> </blockquote> <p>I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that &quot;bug #1&quot;.</p> <p>The second issue is that the pages are marked clean after the error is reported, so further attempts to fsync() the data (in our case for a new attempt to checkpoint) will be futile but appear successful. Call that &quot;bug #2&quot;, with the proviso that some people apparently think it's reasonable behaviour and not a bug. At least there is a plausible workaround for that: namely the nuclear option proposed by Craig.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 02:05:19 </code></pre> <p>On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 12:56 PM, Bruce Momjian wrote:</p> <blockquote> <p>There has been a lot of focus in this thread on the workflow:</p> <pre><code> write() -&gt; blocks remain in kernel memory -&gt; fsync() -&gt; panic? </code></pre> <p>But what happens in this workflow:</p> <pre><code> write() -&gt; kernel syncs blocks to storage -&gt; fsync() </code></pre> <p>Is fsync() going to see a &quot;kernel syncs blocks to storage&quot; failure?</p> <p>There was already discussion that if the fsync() causes the &quot;syncs blocks to storage&quot;, fsync() will only report the failure once, but will it see any failure in the second workflow? 
There is indication that a failed write to storage reports back an error once and clears the dirty flag, but do we know it keeps things around long enough to report an error to a future fsync()?</p> <p>You would think it does, but I have to ask since our fsync() assumptions have been wrong for so long.</p> </blockquote> <p>I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that &quot;bug #1&quot;.</p> </blockquote> <p>So all our non-cutting-edge Linux systems are vulnerable and there is no workaround Postgres can implement? Wow.</p> <blockquote> <p>The second issue is that the pages are marked clean after the error is reported, so further attempts to fsync() the data (in our case for a new attempt to checkpoint) will be futile but appear successful. Call that &quot;bug #2&quot;, with the proviso that some people apparently think it's reasonable behaviour and not a bug. At least there is a plausible workaround for that: namely the nuclear option proposed by Craig.</p> </blockquote> <p>Yes, that one I understood.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 02:14:28 </code></pre> <p>On Tue, Apr 3, 2018 at 10:05:19PM -0400, Bruce Momjian wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote:</p> <blockquote> <p>I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that &quot;bug #1&quot;.</p> </blockquote> <p>So all our non-cutting-edge Linux systems are vulnerable and there is no workaround Postgres can implement? Wow.</p> </blockquote> <p>Uh, are you sure it fixes our use-case? 
From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file <em>after</em> the failure and try to fsync a write that happened <em>before</em> the failure.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 02:40:16 </code></pre> <p>On 4 April 2018 at 05:47, Robert Haas wrote:</p> <blockquote> <p>Now, I hear the DIRECT_IO thing and I assume we're eventually going to have to go that way: Linux kernel developers seem to think that &quot;real men use O_DIRECT&quot; and so if other forms of I/O don't provide useful guarantees, well that's our fault for not using O_DIRECT. That's a political reason, not a technical reason, but it's a reason all the same.</p> </blockquote> <p>I looked into buffered AIO a while ago, by the way, and just ... hell no. Run, run as fast as you can.</p> <p>The trouble with direct I/O is that it pushes a <em>lot</em> of work back on PostgreSQL regarding knowledge of the storage subsystem, I/O scheduling, etc. It's absurd to have the kernel do this, unless you want it reliable, in which case you bypass it and drive the hardware directly.</p> <p>We'd need pools of writer threads to deal with all the blocking I/O. It'd be such a nightmare. Hey, why bother having a kernel at all, except for drivers?</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-04 02:44:22 </code></pre> <p>On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:</p> <blockquote> <p>On Tue, Apr 3, 2018 at 10:05:19PM -0400, Bruce Momjian wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote:</p> <blockquote> <p>I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. 
Call that &quot;bug #1&quot;.</p> </blockquote> <p>So all our non-cutting-edge Linux systems are vulnerable and there is no workaround Postgres can implement? Wow.</p> </blockquote> <p>Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file <em>after</em> the failure and try to fsync a write that happened <em>before</em> the failure.</p> </blockquote> <p>I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the <em>file</em> (presumably via any fd), which sounds like the desired behaviour:</p> <p><a href="https://github.com/torvalds/linux/blob/master/mm/filemap.c#L682">https://github.com/torvalds/linux/blob/master/mm/filemap.c#L682</a></p> <blockquote> <p>When userland calls fsync (or something like nfsd does the equivalent), we want to report any writeback errors that occurred since the last fsync (or since the file was opened if there haven't been any).</p> </blockquote> <p>But I'm not sure what the lifetime of the passed-in &quot;file&quot; and more importantly &quot;file-&gt;f_wb_err&quot; is. Specifically, what happens to it if no one has the file open at all, between operations? It is reference counted, see fs/file_table.c. I don't know enough about it to comment.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-04 05:29:28 </code></pre> <p>On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:</p> <blockquote> <p>Uh, are you sure it fixes our use-case? 
From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file <em>after</em> the failure and try to fsync a write that happened <em>before</em> the failure.</p> </blockquote> <p>I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the <em>file</em> (presumably via any fd), which sounds like the desired behaviour:</p> <p>[..]</p> </blockquote> <p>Scratch that. Whenever you open a file descriptor you can't see any preceding errors at all, because:</p> <pre><code>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping); </code></pre> <p><a href="https://github.com/torvalds/linux/blob/master/fs/open.c#L752">https://github.com/torvalds/linux/blob/master/fs/open.c#L752</a></p> <p>Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. 
But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?</p> <p>If so I'm not sure how that can possibly be considered to be an implementation of _POSIX_SYNCHRONIZED_IO: &quot;the fsync() function shall force all currently queued I/O operations associated with the file indicated by file descriptor fildes to the synchronized I/O completion state.&quot; Note &quot;the file&quot;, not &quot;this file descriptor + copies&quot;, and without reference to when you opened it.</p> <blockquote> <p>But I'm not sure what the lifetime of the passed-in &quot;file&quot; and more importantly &quot;file-&gt;f_wb_err&quot; is.</p> </blockquote> <p>It's really inode-&gt;i_mapping-&gt;wb_err's lifetime that I should have been asking about there, not file-&gt;f_wb_err, but I see now that that question is irrelevant due to the above.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 06:00:21 </code></pre> <p>On 4 April 2018 at 13:29, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:</p> <blockquote> <p>Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file <em>after</em> the failure and try to fsync a write that happened <em>before</em> the failure.</p> </blockquote> <p>I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the <em>file</em> (presumably via any fd), which sounds like the desired behaviour:</p> <p>[..]</p> </blockquote> <p>Scratch that. 
Whenever you open a file descriptor you can't see any preceding errors at all, because:</p> <p>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping);</p> <p><a href="https://github.com/torvalds/linux/blob/master/fs/open.c#L752">https://github.com/torvalds/linux/blob/master/fs/open.c#L752</a></p> <p>Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?</p> </blockquote> <p>Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?</p> <p>I'll see if I can expand my testcase for that. I'm presently dockerizing it to make it easier for others to use, but that turns out to be a major pain when using devmapper etc. Docker in privileged mode doesn't seem to play nice with device-mapper.</p> <p>Does that mean that the ONLY ways to do reliable I/O are:</p> <ul> <li>single-process, single-file-descriptor write() then fsync(); on failure, retry all work since last successful fsync()</li> <li>direct I/O</li> </ul> <p>?</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-04 07:32:04 </code></pre> <p>On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 13:29, Thomas Munro wrote:</p> <blockquote> <p>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping);</p> <p>[...]</p> </blockquote> <p>Holy hell. 
So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?</p> </blockquote> <p>Predates the opening of the file by the process that calls fsync(). Yeah, it sure looks that way based on the above code fragment. Does anyone know better?</p> <blockquote> <p>Does that mean that the ONLY ways to do reliable I/O are:</p> <ul> <li>single-process, single-file-descriptor write() then fsync(); on failure, retry all work since last successful fsync()</li> </ul> </blockquote> <p>I suppose you could come up with some crazy complicated IPC scheme to make sure that the checkpointer always has an fd older than any writes to be flushed, with some fallback strategy for when it can't take any more fds.</p> <p>I haven't got any good ideas right now.</p> <blockquote> <ul> <li>direct I/O</li> </ul> </blockquote> <p>As a bit of an aside, I gather that when you resize files (think truncating/extending relation files) you still need to call fsync() even if you read/write all data with O_DIRECT, to make it flush the filesystem meta-data. I have no idea if that could also be affected by eaten writeback errors.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 07:51:53 </code></pre> <p>On 4 April 2018 at 14:00, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 13:29, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:</p> <blockquote> <p>Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file <em>after</em> the failure and try to fsync a write that happened <em>before</em> the failure.</p> </blockquote> <p>I'm not sure of anything. 
I can see that it's designed to report errors since the last fsync() of the <em>file</em> (presumably via any fd), which sounds like the desired behaviour:</p> <p>[..]</p> </blockquote> <p>Scratch that. Whenever you open a file descriptor you can't see any preceding errors at all, because:</p> <p>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping);</p> <p><a href="https://github.com/torvalds/linux/blob/master/fs/open.c#L752">https://github.com/torvalds/linux/blob/master/fs/open.c#L752</a></p> <p>Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?</p> </blockquote> <p>Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?</p> <p>I'll see if I can expand my testcase for that. I'm presently dockerizing it to make it easier for others to use, but that turns out to be a major pain when using devmapper etc. Docker in privileged mode doesn't seem to play nice with device-mapper.</p> </blockquote> <p>Done, you can find it in <a href="https://github.com/ringerc/scrapcode/tree/master/testcases/fsync-error-clear">https://github.com/ringerc/scrapcode/tree/master/testcases/fsync-error-clear</a> now.</p> <p>Warning, this runs a Docker container in privileged mode on your system, and it uses devicemapper. Read it before you run it, and while I've tried to keep it safe, beware that it might eat your system.</p> <p>For now it tests only xfs and EIO. Other FSs should be easy enough.</p> <p>I haven't added coverage for multi-processing yet, but given what you found above, I should. 
I'll probably just system() a copy of the same proc with instructions to only fsync(). I'll do that next.</p> <p>I haven't worked out a reliable way to trigger ENOSPC on fsync() yet, when mapping without the error hole. It happens sometimes but I don't know why, it almost always happens on write() instead. I know it can happen on nfs, but I'm hoping for a saner example than that to test with. ext4 and xfs do delayed allocation but eager reservation so it shouldn't happen to them.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 13:49:38 </code></pre> <p>On Wed, Apr 4, 2018 at 07:32:04PM +1200, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 13:29, Thomas Munro wrote:</p> <blockquote> <p>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping);</p> <p>[...]</p> </blockquote> <p>Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?</p> </blockquote> <p>Predates the opening of the file by the process that calls fsync(). Yeah, it sure looks that way based on the above code fragment. Does anyone know better?</p> </blockquote> <p>Uh, just to clarify, what is new here is that it is ignoring any <em>errors</em> that happened before the open(). It is not ignoring write()'s that happened but have not been written to storage before the open().</p> <p>FYI, pg_test_fsync has always tested the ability to fsync() writes() from other processes:</p> <pre><code>Test if fsync on non-write file descriptor is honored: (If the times are similar, fsync() can sync data written on a different descriptor.) write, fsync, close 5360.341 ops/sec 187 usecs/op write, close, fsync 4785.240 ops/sec 209 usecs/op </code></pre> <p>Those two numbers should be similar. 
I added this as a check to make sure the behavior we were relying on was working. I never tested sync errors though.</p> <p>I think the fundamental issue is that we always assumed that writes to the kernel that could not be written to storage would remain in the kernel until they succeeded, and that fsync() would report their existence.</p> <p>I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure? To the first fsync that happens after the failure? How long should it continue to record the failure? What if no fsync() ever happens, which is likely for non-Postgres workloads? I think once they decided to discard failed syncs and not retry them, the fsync behavior we are complaining about was almost required.</p> <p>Our only option might be to tell administrators to closely watch for kernel write failure messages, and then restore or failover. :-(</p> <p>The last time I remember being this surprised about storage was in the early Postgres years when we learned that just because the BSD file system uses 8k pages doesn't mean those are atomically written to storage. 
We knew the operating system wrote the data in 8k chunks to storage but:</p> <ul> <li>the 8k pages are written as separate 512-byte sectors</li> <li>the 8k might be contiguous logically on the drive but not physically</li> <li>even 512-byte sectors are not written atomically</li> </ul> <p>This is why pre-page images are written to WAL, which is what full_page_writes controls.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 13:53:01 </code></pre> <p>On Wed, Apr 4, 2018 at 10:40:16AM +0800, Craig Ringer wrote:</p> <blockquote> <p>The trouble with direct I/O is that it pushes a <em>lot</em> of work back on PostgreSQL regarding knowledge of the storage subsystem, I/O scheduling, etc. It's absurd to have the kernel do this, unless you want it reliable, in which case you bypass it and drive the hardware directly.</p> <p>We'd need pools of writer threads to deal with all the blocking I/O. It'd be such a nightmare. Hey, why bother having a kernel at all, except for drivers?</p> </blockquote> <p>I believe this is how Oracle views the kernel, so there is precedent for this approach, though I am not advocating it.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 14:00:15 </code></pre> <p>On 4 April 2018 at 15:51, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 14:00, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 13:29, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:</p> <blockquote> <p>Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file <em>after</em> the failure and try to fsync a write that happened <em>before</em> the failure.</p> </blockquote> <p>I'm not sure of anything. 
I can see that it's designed to report errors since the last fsync() of the <em>file</em> (presumably via any fd), which sounds like the desired behaviour:</p> <p>[..]</p> </blockquote> <p>Scratch that. Whenever you open a file descriptor you can't see any preceding errors at all, because:</p> <p>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping);</p> <p><a href="https://github.com/torvalds/linux/blob/master/fs/open.c#L752">https://github.com/torvalds/linux/blob/master/fs/open.c#L752</a></p> <p>Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?</p> </blockquote> <p>Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?</p> <p>I'll see if I can expand my testcase for that. I'm presently dockerizing it to make it easier for others to use, but that turns out to be a major pain when using devmapper etc. Docker in privileged mode doesn't seem to play nice with device-mapper.</p> </blockquote> <p>Done, you can find it in <a href="https://github.com/ringerc/scrapcode/tree/master/">https://github.com/ringerc/scrapcode/tree/master/</a> testcases/fsync-error-clear now.</p> </blockquote> <p>Update. Now supports multiple FSes.</p> <p>I've tried xfs, jfs, ext3, ext4, even vfat. All behave the same on EIO. Didn't try zfs-on-linux or other platforms yet.</p> <p>Still working on getting ENOSPC on fsync() rather than write(). 
Kernel code reading suggests this is possible, but all the above FSes reserve space eagerly on write() even if they do delayed allocation of the actual storage, so it doesn't seem to happen at least in my simple single-process test.</p> <p>I'm not overly inclined to complain about a fsync() succeeding after a write() error. That seems reasonable enough, the kernel told the app at the time of the failure. What else is it going to do? I don't personally even object hugely to the current fsync() behaviour if it were, say, DOCUMENTED and conformant to the relevant standards, though not giving us any sane way to find out the affected file ranges makes it drastically harder to recover sensibly.</p> <p>But what's come out since on this thread, that we cannot even rely on fsync() giving us an EIO <em>once</em> when it loses our data, because:</p> <ul> <li>all currently widely deployed kernels can fail to deliver info due to a recently fixed limitation; and</li> <li>the kernel deliberately hides errors from us if they relate to writes that occurred before we opened the FD (?)</li> </ul> <p>... that's really troubling. I thought we could at least fix this by PANICing on EIO, and was mostly worried about ENOSPC. But now it seems we can't even do that and expect reliability. So what the @#$ are we meant to do?</p> <p>It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 14:09:09 </code></pre> <p>On 4 April 2018 at 22:00, Craig Ringer wrote:</p> <blockquote> <p>It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. 
I'll be expanding my test case to cover that shortly.</p> </blockquote> <p>Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.</p> <p>Nor is ext3/ext4's errors=remount-ro protective. data_err=abort doesn't help either (so what does it do?).</p> <p>What bewilders me is that running with data=journal doesn't seem to be safe either. WTF?</p> <pre><code>[26438.846111] EXT4-fs (dm-0): mounted filesystem with journalled data mode. Opts: errors=remount-ro,data_err=abort,data=journal
[26454.125319] EXT4-fs warning (device dm-0): ext4_end_bio:323: I/O error 10 writing to inode 12 (offset 0 size 0 starting block 59393)
[26454.125326] Buffer I/O error on device dm-0, logical block 59393
[26454.125337] Buffer I/O error on device dm-0, logical block 59394
[26454.125343] Buffer I/O error on device dm-0, logical block 59395
[26454.125350] Buffer I/O error on device dm-0, logical block 59396
</code></pre> <p>and splat, there goes your data anyway.</p> <p>It's possible that this is in some way related to using the device-mapper &quot;error&quot; target and a loopback device in testing. But I don't really see how.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 14:25:47 </code></pre> <p>On Wed, Apr 4, 2018 at 10:09:09PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 22:00, Craig Ringer wrote:</p> <pre><code>It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly. </code></pre> <p>Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.</p> <p>Nor is ext3/ext4's errors=remount-ro protective. 
data_err=abort doesn't help either (so what does it do?).</p> </blockquote> <p>Anthony Iliopoulos reported in this thread that errors=remount-ro is only affected by metadata writes.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 14:42:18 </code></pre> <p>On 4 April 2018 at 22:25, Bruce Momjian wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 10:09:09PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 22:00, Craig Ringer wrote:</p> <pre><code>It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly. </code></pre> <p>Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.</p> <p>Nor is ext3/ext4's errors=remount-ro protective. data_err=abort doesn't help either (so what does it do?).</p> </blockquote> <p>Anthony Iliopoulos reported in this thread that errors=remount-ro is only affected by metadata writes.</p> </blockquote> <p>Yep, I gathered. I was referring to data_err.</p> <hr /> <pre><code>From:Antonis Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-04 15:23:31 </code></pre> <p>On Wed, Apr 4, 2018 at 4:42 PM, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 22:25, Bruce Momjian wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 10:09:09PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 22:00, Craig Ringer wrote:</p> <p>It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.</p> <p>Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.</p> <p>Nor is ext3/ext4's errors=remount-ro protective. 
data_err=abort doesn't help either (so what does it do?).</p> </blockquote> <p>Anthony Iliopoulos reported in this thread that errors=remount-ro is only affected by metadata writes.</p> </blockquote> <p>Yep, I gathered. I was referring to data_err.</p> </blockquote> <p>As far as I recall data_err=abort pertains to the jbd2 handling of potential writeback errors. Jbd2 will internally attempt to drain the data upon txn commit (and it's even kind enough to restore the EIO at the address space level, that otherwise would get eaten).</p> <p>When data_err=abort is set, then jbd2 forcibly shuts down the entire journal, with the error being propagated upwards to ext4. I am not sure at which point this would be manifested to userspace and how, but in principle any subsequent fs operations would get some filesystem error due to the journal being down (I would assume similar to remounting the fs read-only).</p> <p>Since you are using data=journal, I would indeed expect to see something more than what you saw in dmesg.</p> <p>I can have a look later, I plan to also respond to some of the other interesting issues that you guys raised in the thread.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-04 15:23:51 </code></pre> <p>On 4 April 2018 at 21:49, Bruce Momjian wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 07:32:04PM +1200, Thomas Munro wrote:</p> <blockquote> <p>On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 13:29, Thomas Munro wrote:</p> <blockquote> <p>/* Ensure that we skip any errors that predate opening of the file */ f-&gt;f_wb_err = filemap_sample_wb_err(f-&gt;f_mapping);</p> <p>[...]</p> </blockquote> <p>Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?</p> </blockquote> <p>Predates the opening of the file by the process that calls fsync(). 
Yeah, it sure looks that way based on the above code fragment. Does anyone know better?</p> </blockquote> <p>Uh, just to clarify, what is new here is that it is ignoring any <em>errors</em> that happened before the open(). It is not ignoring write()'s that happened but have not been written to storage before the open().</p> <p>FYI, pg_test_fsync has always tested the ability to fsync() write()s from other processes:</p> <pre><code>Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close              5360.341 ops/sec     187 usecs/op
        write, close, fsync              4785.240 ops/sec     209 usecs/op
</code></pre> <p>Those two numbers should be similar. I added this as a check to make sure the behavior we were relying on was working. I never tested sync errors though.</p> <p>I think the fundamental issue is that we always assumed that writes to the kernel that could not be written to storage would remain in the kernel until they succeeded, and that fsync() would report their existence.</p> <p>I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure?</p> </blockquote> <p>Ideally until the app tells it not to.</p> <p>But there's no standard API for that.</p> <p>The obvious answer seems to be &quot;until the FD is closed&quot;. But we just discussed how Pg relies on being able to open and close files freely. That may not be as reasonable a thing to do as we thought it was when you consider error reporting. What's the kernel meant to do? How long should it remember &quot;I had an error while doing writeback on this file&quot;? Should it flag the file metadata and remember across reboots? Obviously not, but where does it stop? Tell the next program that does an fsync() and forget? 
How could it associate a dirty buffer on a file with no open FDs with any particular program at all? And what if the app did a write then closed the file and went away, never to bother to check the file again, like most apps do?</p> <p>Some I/O errors are transient (network issue, etc). Some are recoverable with some sort of action, like disk space issues, but may take a long time before an admin steps in. Some are entirely unrecoverable (disk 1 in striped array is on fire) and there's no possible recovery. Currently we kind of hope the kernel will deal with figuring out which is which and retrying. Turns out it doesn't do that so much, and I don't think the reasons for that are wholly unreasonable. We may have been asking too much.</p> <p>That does leave us in a pickle when it comes to the checkpointer and opening/closing FDs. I don't know what the &quot;right&quot; thing for the kernel to do from our perspective even is here, but the best I can come up with is actually pretty close to what it does now. Report the fsync() error to the first process that does an fsync() since the writeback error if one has occurred, then forget about it. Ideally I'd have liked it to mark all FDs pointing to the file with a flag to report EIO on next fsync too, but it turns out that won't even help us due to our opening and closing behaviour, so we're going to have to take responsibility for handling and communicating that ourselves, preventing checkpoint completion if any backend gets an fsync error. Probably by PANICing. Some extra work may be needed to ensure reliable ordering and stop checkpoints completing if their fsync() succeeds due to a recent failed fsync() on a normal backend that hasn't PANICed or where the postmaster hasn't noticed yet.</p> <blockquote> <p>Our only option might be to tell administrators to closely watch for kernel write failure messages, and then restore or failover. :-(</p> </blockquote>
<p>Speaking of, there's not necessarily any lost page write error in the logs AFAICS. My tests often just show &quot;Buffer I/O error on device dm-0, logical block 59393&quot; or the like.</p> <hr /> <pre><code>From:Gasper Zejn &lt;zejn(at)owca(dot)info&gt; Date:2018-04-04 17:23:58 </code></pre> <p>On 04. 04. 2018 15:49, Bruce Momjian wrote:</p> <blockquote> <p>I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure? To the first fsync that happens after the failure? How long should it continue to record the failure? What if no fsync() ever happens, which is likely for non-Postgres workloads? I think once they decided to discard failed syncs and not retry them, the fsync behavior we are complaining about was almost required.</p> </blockquote> <p>Ideally the kernel would keep its data for as little time as possible. With fsync, it doesn't really know which process is interested in knowing about a write error, it just assumes the caller will know how to deal with it. The most unfortunate issue is that there's no way to get information about a write error.</p> <p>Thinking aloud - couldn't/shouldn't a write error also be a file system event reported by inotify? Admittedly that's only a thing on Linux, but still.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-04 17:51:03 </code></pre> <p>On Wed, Apr 4, 2018 at 11:23:51PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 4 April 2018 at 21:49, Bruce Momjian wrote:</p> <blockquote> <p>I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. 
Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure?</p> </blockquote> <p>Ideally until the app tells it not to.</p> <p>But there's no standard API for that.</p> </blockquote> <p>You would almost need an API that registers <em>before</em> the failure that you care about sync failures, and that you plan to call fsync() to gather such information. I am not sure how you would allow more than the first fsync() to see the failure unless you added <em>another</em> API to clear the fsync failure, but I don't see the point since the first fsync() might call that clear function. How many applications are going to know there is <em>another</em> application that cares about the failure? Not many.</p> <blockquote> <p>Currently we kind of hope the kernel will deal with figuring out which is which and retrying. Turns out it doesn't do that so much, and I don't think the reasons for that are wholly unreasonable. We may have been asking too much.</p> </blockquote> <p>Agreed.</p> <blockquote> <blockquote> <p>Our only option might be to tell administrators to closely watch for kernel write failure messages, and then restore or failover. :-(</p> </blockquote> <p>Speaking of, there's not necessarily any lost page write error in the logs AFAICS. My tests often just show &quot;Buffer I/O error on device dm-0, logical block 59393&quot; or the like.</p> </blockquote> <p>I assume that is the kernel logs. I am thinking the kernel logs have to be monitored, but how many administrators do that? The other issue I think you are pointing out is how is the administrator going to know this is a Postgres file? I guess any sync error to a device that contains Postgres has to assume Postgres is corrupted. 
:-(</p> <hr /> <p>see explicit treatment of retrying, though I'm not entirely sure if the retry flag is set just for async write-back), and apparently unlike every other kernel I've tried to grok so far (things descended from ancestral BSD but not descended from FreeBSD, with macOS/Darwin apparently in the first category for this purpose).</p> <p>Here's a new ticket in the NetBSD bug database for this stuff:</p> <p><a href="http://gnats.netbsd.org/53152">http://gnats.netbsd.org/53152</a></p> <p>As mentioned in that ticket and by Andres earlier in this thread, keeping the page dirty isn't the only strategy that would work and may be problematic in different ways (it tells the truth but floods your cache with unflushable stuff until eventually you force unmount it and your buffers are eventually invalidated after ENXIO errors? I don't know.). I have no qualified opinion on that. I just know that we need a way for fsync() to tell the truth about all preceding writes or our checkpoints are busted.</p> <p>*We mmap() + msync() in pg_flush_data() if you don't have sync_file_range(), and I see now that that is probably not a great idea on ZFS because you'll finish up double-buffering (or is that triple-buffering?), flooding your page cache with transient data. Oops. That is off-topic and not relevant for the checkpoint correctness topic of this thread though, since pg_flush_data() is advisory only.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-04 22:14:24 </code></pre> <p>On Thu, Apr 5, 2018 at 9:28 AM, Thomas Munro wrote:</p> <blockquote> <p>On Thu, Apr 5, 2018 at 2:00 AM, Craig Ringer wrote:</p> <blockquote> <p>I've tried xfs, jfs, ext3, ext4, even vfat. All behave the same on EIO. Didn't try zfs-on-linux or other platforms yet.</p> </blockquote> <p>While contemplating what exactly it would do (not sure),</p> </blockquote> <p>See manual for failmode=wait | continue | panic. 
Even &quot;continue&quot; returns EIO to all new write requests, so they apparently didn't bother to supply an 'eat-my-data-but-tell-me-everything-is-fine' mode. Figures.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-05 07:09:57 </code></pre> <p>Summary to date:</p> <p>It's worse than I thought originally, because:</p> <ul> <li>Most widely deployed kernels have cases where they don't tell you about losing your writes at all; and</li> <li>Information about loss of writes can be masked by closing and re-opening a file</li> </ul> <p>So the checkpointer cannot trust that a successful fsync() means ... a successful fsync().</p> <p>Also, it's been reported to me off-list that anyone on the system calling sync(2) or the sync shell command will also generally consume the write error, causing us not to see it when we fsync(). The same is true for /proc/sys/vm/drop_caches. I have not tested these yet.</p> <p>There's some level of agreement that we should PANIC on fsync() errors, at least on Linux, but likely everywhere. But we also now know it's insufficient to be fully protective.</p> <p>I previously thought that errors=remount-ro was a sufficient safeguard. It isn't. There doesn't seem to be anything that is, for ext3, ext4, btrfs or xfs.</p> <p>It's not clear to me yet why data_err=abort isn't sufficient in data=ordered or data=writeback mode on ext3 or ext4, needs more digging. (In my test tools that's: make FSTYPE=ext4 MKFSOPTS=&quot;&quot; MOUNTOPTS=&quot;errors=remount-ro,data_err=abort,data=journal&quot; as of the current version d7fe802ec). AFAICS that's because data_err=abort only affects data=ordered, not data=journal. If you use data=ordered, you at least get retries of the same write failing. 
This post <a href="https://lkml.org/lkml/2008/10/10/80">https://lkml.org/lkml/2008/10/10/80</a> added the option and has some explanation, but doesn't explain why it doesn't affect data=journal.</p> <p>zfs is probably not affected by the issues, per Thomas Munro. I haven't run my test scripts on it yet because my kernel doesn't have zfs support and I'm prioritising the multi-process / open-and-close issues.</p> <p>So far none of the FSes and options I've tried exhibit the behaviour I actually want, which is to make the fs readonly or inaccessible on I/O error.</p> <p>ENOSPC doesn't seem to be a concern during normal operation of major file systems (ext3, ext4, btrfs, xfs) because they reserve space before returning from write(). But if a buffered write does manage to fail due to ENOSPC we'll definitely see the same problems. This makes ENOSPC on NFS a potentially data corrupting condition since NFS doesn't preallocate space before returning from write().</p> <p>I think what we really need is a block-layer fix, where an I/O error flips the block device into read-only mode, as if blockdev --setro had been used. Though I'd settle for a kernel panic, frankly. I don't think anybody really wants this, but I'd rather either of those to silent data loss.</p> <p>I'm currently tweaking my test to close and reopen the file between each write() and fsync(), and to support running with nfs.</p> <p>I've also just found the device-mapper &quot;flakey&quot; driver, which looks fantastic for simulating unreliable I/O with intermittent faults. 
I've been using the &quot;error&quot; target in a mapping, which lets me remap some of the device to always error, but &quot;flakey&quot; looks very handy for actual PostgreSQL testing.</p> <p>For the sake of Google, these are errors known to be associated with the problem:</p> <p>ext4, and ext3 mounted with ext4 driver:</p> <pre><code>[42084.327345] EXT4-fs warning (device dm-0): ext4_end_bio:323: I/O error 10 writing to inode 12 (offset 0 size 0 starting block 59393)
[42084.327352] Buffer I/O error on device dm-0, logical block 59393
</code></pre> <p>xfs:</p> <pre><code>[42193.771367] XFS (dm-0): writeback error on sector 118784
[42193.784477] XFS (dm-0): writeback error on sector 118784
</code></pre> <p>jfs: (nil, silence in the kernel logs)</p> <p>You should also beware of &quot;lost page write&quot; or &quot;lost write&quot; errors.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-05 08:46:08 </code></pre> <p>On 5 April 2018 at 15:09, Craig Ringer wrote:</p> <blockquote> <p>Also, it's been reported to me off-list that anyone on the system calling sync(2) or the sync shell command will also generally consume the write error, causing us not to see it when we fsync(). The same is true for /proc/sys/vm/drop_caches. I have not tested these yet.</p> </blockquote> <p>I just confirmed this with a tweak to the test that</p> <pre><code>records the file position
close()s the fd
sync()s
open()s the file
lseek()s back to the recorded position
</code></pre> <p>This causes the test to completely ignore the I/O error, which is not reported to it at any time.</p> <p>Fair enough, really, when you look at it from the kernel's point of view. What else can it do? Nobody has the file open. It'd have to mark the file itself as bad somehow. But that's pretty bad for our robustness AFAICS.</p> <blockquote> <p>There's some level of agreement that we should PANIC on fsync() errors, at least on Linux, but likely everywhere. 
But we also now know it's insufficient to be fully protective.</p> </blockquote> <p>If dirty writeback fails between our close() and re-open() I see the same behaviour as with sync(). To test that I set dirty_writeback_centisecs and dirty_expire_centisecs to 1 and added a usleep(3*100*1000) between close() and open(). (It's still plenty slow). So sync() is a convenient way to simulate something other than our own fsync() writing out the dirty buffer.</p> <p>If I omit the sync() then we get the error reported by fsync() once when we re-open() the file and fsync() it, because the buffers weren't written out yet, so the error wasn't generated until we re-open()ed the file. But I doubt that'll happen much in practice because dirty writeback will get to it first so the error will be seen and discarded before we reopen the file in the checkpointer.</p> <p>In other words, it looks like <em>even with a new kernel with the error reporting bug fixes</em>, if I understand how the backends and checkpointer interact when it comes to file descriptors, we're unlikely to notice I/O errors and fail a checkpoint. We may notice I/O errors if a backend does its own eager writeback for large I/O operations, or if the checkpointer fsync()s a file before the kernel's dirty writeback gets around to trying to flush the pages that will fail.</p> <p>I haven't tested anything with multiple processes / multiple FDs yet, where we keep one fd open while writing on another.</p> <p>But at this point I don't see any way to make Pg reliably detect I/O errors and fail a checkpoint then redo and retry. To even fix this by PANICing like I proposed originally, we need to know we have to PANIC.</p> <p>AFAICS it's completely unsafe to write(), close(), open() and fsync() and expect that the fsync() makes any promises about the write(). Which if I read Pg's low level storage code right, makes it completely unable to reliably detect I/O errors.</p> <p>When you put it that way, it sounds fair enough too. 
How long is the kernel meant to remember that there was a write error on the file triggered by a write initiated by some seemingly unrelated process, some unbounded time ago, on a since-closed file?</p> <p>But it seems to put Pg on the fast track to O_DIRECT.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-05 19:33:14 </code></pre> <p>On Thu, Apr 5, 2018 at 03:09:57PM +0800, Craig Ringer wrote:</p> <blockquote> <p>ENOSPC doesn't seem to be a concern during normal operation of major file systems (ext3, ext4, btrfs, xfs) because they reserve space before returning from write(). But if a buffered write does manage to fail due to ENOSPC we'll definitely see the same problems. This makes ENOSPC on NFS a potentially data corrupting condition since NFS doesn't preallocate space before returning from write().</p> </blockquote> <p>This does explain why NFS has a reputation for unreliability for Postgres.</p> <hr /> <pre><code>From:Andrew Gierth &lt;andrew(at)tao11(dot)riddles(dot)org(dot)uk&gt; Date:2018-04-05 23:37:42 </code></pre> <p>Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS <em>does</em> do the right thing (discovered by testing on FreeBSD).</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-06 01:27:05 </code></pre> <p>On 6 April 2018 at 07:37, Andrew Gierth wrote:</p> <blockquote> <p>Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS <em>does</em> do the right thing (discovered by testing on FreeBSD).</p> </blockquote> <p>Yikes. 
For other readers, the related thread for this is</p> <p>Meanwhile, I've extended my test to run postgres on a deliberately faulty volume and confirmed my results there.</p> <pre><code>2018-04-06 01:11:40.555 UTC [58] LOG: checkpoint starting: immediate force wait
2018-04-06 01:11:40.567 UTC [58] ERROR: could not fsync file &quot;base/12992/16386&quot;: Input/output error
2018-04-06 01:11:40.655 UTC [66] ERROR: checkpoint request failed
2018-04-06 01:11:40.655 UTC [66] HINT: Consult recent messages in the server log for details.
2018-04-06 01:11:40.655 UTC [66] STATEMENT: CHECKPOINT
Checkpoint failed with checkpoint request failed
HINT: Consult recent messages in the server log for details.
Retrying
2018-04-06 01:11:41.568 UTC [58] LOG: checkpoint starting: immediate force wait
2018-04-06 01:11:41.614 UTC [58] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.000 s, total=0.046 s; sync files=3, longest=0.000 s, average=0.000 s; distance=2727 kB, estimate=2779 kB
</code></pre> <p>Given your report, now I have to wonder if we even reissued the fsync() at all this time. 'perf' time. OK, with</p> <pre><code>sudo perf record -e syscalls:sys_enter_fsync,syscalls:sys_exit_fsync -a
sudo perf script
</code></pre> <p>I see the failed fsync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.</p> <pre><code>postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb
...
postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0
</code></pre> <p>... 
and Pg continues merrily on its way without realising it lost data:</p> <pre><code>[72379.834872] XFS (dm-0): writeback error on sector 118752 [72380.324707] XFS (dm-0): writeback error on sector 118688 </code></pre> <p>In this test I set things up so the checkpointer would see the first fsync() error. But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too. I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error. But if we don't see the error at all we're in trouble.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-06 02:53:56 </code></pre> <p>On Fri, Apr 6, 2018 at 1:27 PM, Craig Ringer wrote:</p> <blockquote> <p>On 6 April 2018 at 07:37, Andrew Gierth wrote:</p> <blockquote> <p>Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS <em>does</em> do the right thing (discovered by testing on FreeBSD).</p> </blockquote> <p>Yikes. For other readers, the related thread for this is</p> </blockquote> <p>Yeah. That's really embarrassing, especially after beating up on various operating systems all week. It's also an independent issue -- let's keep that on the other thread and get it fixed.</p> <blockquote> <p>I see the failed fync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.</p> <pre><code> postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: </code></pre> <p>0x00000005 postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb ... postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005 postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0</p> <p>... 
and Pg continues merrily on its way without realising it lost data:</p> <p>[72379.834872] XFS (dm-0): writeback error on sector 118752 [72380.324707] XFS (dm-0): writeback error on sector 118688</p> <p>In this test I set things up so the checkpointer would see the first fsync() error. But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too. I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error. But if we don't see the error at all we're in trouble.</p> </blockquote> <p>I suppose you only see errors because the file descriptors linger open in the virtual file descriptor cache, which is a matter of luck depending on how many relation segment files you touched. One thing you could try to confirm our understanding of the Linux 4.13+ policy would be to hack PostgreSQL so that it reopens the file descriptor every time in mdsync(). See attached.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-06 03:20:22 </code></pre> <p>On 6 April 2018 at 10:53, Thomas Munro wrote:</p> <blockquote> <p>On Fri, Apr 6, 2018 at 1:27 PM, Craig Ringer wrote:</p> <blockquote> <p>On 6 April 2018 at 07:37, Andrew Gierth wrote:</p> <blockquote> <p>Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS <em>does</em> do the right thing (discovered by testing on FreeBSD).</p> </blockquote> <p>Yikes. For other readers, the related thread for this is news-spur.riddles.org.uk</p> </blockquote> <p>Yeah. That's really embarrassing, especially after beating up on various operating systems all week. 
It's also an independent issue -- let's keep that on the other thread and get it fixed.</p> <blockquote> <p>I see the failed fsync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.</p> <pre><code>postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb
...
postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0
</code></pre> <p>... and Pg continues merrily on its way without realising it lost data:</p> <pre><code>[72379.834872] XFS (dm-0): writeback error on sector 118752
[72380.324707] XFS (dm-0): writeback error on sector 118688
</code></pre> <p>In this test I set things up so the checkpointer would see the first fsync() error. But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too. I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error. But if we don't see the error at all we're in trouble.</p> </blockquote> <p>I suppose you only see errors because the file descriptors linger open in the virtual file descriptor cache, which is a matter of luck depending on how many relation segment files you touched.</p> </blockquote> <p>In this case I think it's because the kernel didn't get around to doing the writeback before the eagerly forced checkpoint fsync()'d it. Or we didn't even queue it for writeback from our own shared_buffers until just before we fsync()'d it. 
After all, it's a contrived test case that tries to reproduce the issue rapidly with big writes and frequent checkpoints.</p> <p>So the checkpointer had the relation open to fsync() it, and it was the checkpointer's fsync() that did writeback on the dirty page and noticed the error.</p> <p>If the kernel had done the writeback before the checkpointer opened the relation to fsync() it, we might not have seen the error at all - though as you note this depends on the file descriptor cache. You can see the silent-error behaviour in my standalone test case where I confirmed the post-4.13 behaviour. (I'm on 4.14 here).</p> <p>I can try to reproduce it with postgres too, but it not only requires closing and reopening the FDs, it also requires forcing writeback before opening the fd. To make it occur in a practical timeframe I have to make my kernel writeback settings insanely aggressive and/or call sync() before re-open()ing. I don't really think it's worth it, since I've confirmed the behaviour already with the simpler test in standalone/ in the test repo. 
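</p> <p>As a rough illustration only (this is not the C test in the repo), the reopen case comes down to the sequence below, sketched in Python. The function name and the file name are made up for the example. On a healthy disk it simply succeeds; the point is that, going by the post-4.13 behaviour described in this thread, a writeback failure that happens between the close and the reopen would not be reported by the final fsync().</p>

```python
import os
import tempfile


def write_close_reopen_fsync(path, payload):
    """Write through one descriptor, close it, then fsync through a new one."""
    # "Writer" side: buffered write, then close without calling fsync.
    fd_a = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd_a, payload)
    finally:
        os.close(fd_a)

    # The kernel is free to attempt writeback at any point from here on.
    # No descriptor from the writer is open any more to observe a failure.

    # "Checkpointer" side: reopen later and fsync. os.fsync raises OSError
    # on failure, but only for errors visible to this new descriptor.
    fd_b = os.open(path, os.O_WRONLY)
    try:
        os.fsync(fd_b)
    finally:
        os.close(fd_b)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name standing in for a relation segment.
        target = os.path.join(tmp, "relation-segment")
        print(write_close_reopen_fsync(target, b"x" * 8192))
```

<p>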
To try it yourself, clone</p> <p><a href="https://github.com/ringerc/scrapcode">https://github.com/ringerc/scrapcode</a></p> <p>and in the master branch</p> <pre><code>cd testcases/fsync-error-clear less README make REOPEN=reopen standalone-run </code></pre> <p>See <a href="https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear/standalone/fsync-error-clear.c#L118">https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear/standalone/fsync-error-clear.c#L118</a> .</p> <p>I've pushed the postgres test to that repo too; &quot;make postgres-run&quot;.</p> <p>You'll need docker, and be warned, it's using privileged docker containers and messing with dmsetup.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-08 02:16:07 </code></pre> <p>So, what can we actually do about this new Linux behaviour?</p> <p>Idea 1:</p> <ul> <li>whenever you open a file, either tell the checkpointer so it can open it too (and wait for it to tell you that it has done so, because it's not safe to write() until then), or send it a copy of the file descriptor via IPC (since duplicated file descriptors share the same f_wb_err)</li> <li>if the checkpointer can't take any more file descriptors (how would that limit even work in the IPC case?), then it somehow needs to tell you that so that you know that you're responsible for fsyncing that file yourself, both on close (due to fd cache recycling) and also when the checkpointer tells you to</li> </ul> <p>Maybe it could be made to work, but sheesh, that seems horrible. 
Is there some simpler idea along these lines that could make sure that fsync() is only ever called on file descriptors that were opened before all unflushed writes, or file descriptors cloned from such file descriptors?</p> <p>Idea 2:</p> <p>Give up, complain that this implementation is defective and unworkable, both on POSIX-compliance grounds and on POLA grounds, and campaign to get it fixed more fundamentally (actual details left to the experts, no point in speculating here, but we've seen a few approaches that work on other operating systems including keeping buffers dirty and marking the whole filesystem broken/read-only).</p> <p>Idea 3:</p> <p>Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.</p> <p>Any other ideas?</p> <p>For a while I considered suggesting an idea which I now think doesn't work. I thought we could try asking for a new fcntl interface that spits out wb_err counter. Call it an opaque error token or something. Then we could store it in our fsync queue and safely close the file. Check again before fsync()ing, and if we ever see a different value, PANIC because it means a writeback error happened while we weren't looking. Sadly I think it doesn't work because AIUI inodes are not pinned in kernel memory when no one has the file open and there are no dirty buffers, so I think the counters could go away and be reset. Perhaps you could keep inodes pinned by keeping the associated buffers dirty after an error (like FreeBSD), but if you did that you'd have solved the problem already and wouldn't really need the wb_err system at all. 
Is there some other idea along these lines that could work?</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-08 02:33:37 </code></pre> <p>On Sun, Apr 8, 2018 at 02:16:07PM +1200, Thomas Munro wrote:</p> <blockquote> <p>So, what can we actually do about this new Linux behaviour?</p> <p>Idea 1:</p> <ul> <li><p>whenever you open a file, either tell the checkpointer so it can open it too (and wait for it to tell you that it has done so, because it's not safe to write() until then), or send it a copy of the file descriptor via IPC (since duplicated file descriptors share the same f_wb_err)</p></li> <li><p>if the checkpointer can't take any more file descriptors (how would that limit even work in the IPC case?), then it somehow needs to tell you that so that you know that you're responsible for fsyncing that file yourself, both on close (due to fd cache recycling) and also when the checkpointer tells you to</p></li> </ul> <p>Maybe it could be made to work, but sheesh, that seems horrible. 
Is there some simpler idea along these lines that could make sure that fsync() is only ever called on file descriptors that were opened before all unflushed writes, or file descriptors cloned from such file descriptors?</p> <p>Idea 2:</p> <p>Give up, complain that this implementation is defective and unworkable, both on POSIX-compliance grounds and on POLA grounds, and campaign to get it fixed more fundamentally (actual details left to the experts, no point in speculating here, but we've seen a few approaches that work on other operating systems including keeping buffers dirty and marking the whole filesystem broken/read-only).</p> <p>Idea 3:</p> <p>Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.</p> </blockquote> <p>Idea 4 would be for people to assume their database is corrupt if their server logs report any I/O error on the file systems Postgres uses.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-08 02:37:47 </code></pre> <blockquote> <p>On Apr 7, 2018, at 19:33, Bruce Momjian wrote: Idea 4 would be for people to assume their database is corrupt if their server logs report any I/O error on the file systems Postgres uses.</p> </blockquote> <p>Pragmatically, that's where we are right now. 
The best answer in this bad situation is (a) fix the error, then (b) replay from a checkpoint before the error occurred, but it appears we can't even guarantee that a PostgreSQL process will be the one to see the error.</p> <p>-- -- Christophe Pettus xof(at)thebuild(dot)com</p> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-08 03:27:45 </code></pre> <p>On 8 April 2018 at 10:16, Thomas Munro wrote:</p> <blockquote> <p>So, what can we actually do about this new Linux behaviour?</p> </blockquote> <p>Yeah, I've been cooking over that myself.</p> <p>More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.</p> <p>We have a storage abstraction that makes this way, way less painful than it should be.</p> <p>We can virtualize relfilenodes into storage extents in relatively few big files. We could use sparse regions to make the addressing more convenient, but that makes copying and backup painful, so I'd rather not.</p> <p>Even one file per tablespace for persistent relation heaps, another for indexes, another for each fork type.</p> <p>That way we can use something like your #1 (which is what I was also thinking about then rejecting previously), but reduce the pain by reducing the FD count drastically so exhausting FDs stops being a problem.</p> <p>Previously I was leaning toward what you've described here:</p> <blockquote> <ul> <li><p>whenever you open a file, either tell the checkpointer so it can open it too (and wait for it to tell you that it has done so, because it's not safe to write() until then), or send it a copy of the file descriptor via IPC (since duplicated file descriptors share the same f_wb_err)</p></li> <li><p>if the checkpointer can't take any more file descriptors (how would that limit even work in the IPC case?), then it somehow needs to tell you that so that you know that you're responsible for fsyncing that file yourself, both on close (due to 
fd cache recycling) and also when the checkpointer tells you to</p></li> </ul> <p>Maybe it could be made to work, but sheesh, that seems horrible. Is there some simpler idea along these lines that could make sure that fsync() is only ever called on file descriptors that were opened before all unflushed writes, or file descriptors cloned from such file descriptors?</p> </blockquote> <p>... and got stuck on &quot;yuck, that's awful&quot;.</p> <p>I was assuming we'd force early checkpoints if the checkpointer hit its fd limit, but that's even worse.</p> <p>We'd need to urgently do away with segmented relations, and partitions would start to become a hinderance.</p> <p>Even then it's going to be an unworkable nightmare with heavily partitioned systems, systems that use schema-sharding, etc. And it'll mean we need to play with process limits and, often, system wide limits on FDs. I imagine the performance implications won't be pretty.</p> <p>Idea 2:</p> <blockquote> <p>Give up, complain that this implementation is defective and unworkable, both on POSIX-compliance grounds and on POLA grounds, and campaign to get it fixed more fundamentally (actual details left to the experts, no point in speculating here, but we've seen a few approaches that work on other operating systems including keeping buffers dirty and marking the whole filesystem broken/read-only).</p> </blockquote> <p>This appears to be what SQLite does AFAICS.</p> <p><a href="https://www.sqlite.org/atomiccommit.html">https://www.sqlite.org/atomiccommit.html</a></p> <p>though it has the huge luxury of a single writer, so it's probably only subject to the original issue not the multiprocess / checkpointer issues we face.</p> <blockquote> <p>Idea 3:</p> <p>Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.</p> </blockquote> <p>That seems to be what the kernel folks will expect. But that's going to KILL performance. 
We'll need writer threads to have any hope of it not <em>totally</em> sucking, because otherwise simple things like updating a heap tuple and two related indexes will incur enormous disk latencies.</p> <p>But I suspect it's the path forward.</p> <p>Goody.</p> <blockquote> <p>Any other ideas?</p> <p>For a while I considered suggesting an idea which I now think doesn't work. I thought we could try asking for a new fcntl interface that spits out wb_err counter. Call it an opaque error token or something. Then we could store it in our fsync queue and safely close the file. Check again before fsync()ing, and if we ever see a different value, PANIC because it means a writeback error happened while we weren't looking. Sadly I think it doesn't work because AIUI inodes are not pinned in kernel memory when no one has the file open and there are no dirty buffers, so I think the counters could go away and be reset. Perhaps you could keep inodes pinned by keeping the associated buffers dirty after an error (like FreeBSD), but if you did that you'd have solved the problem already and wouldn't really need the wb_err system at all. Is there some other idea long these lines that could work?</p> </blockquote> <p>I think our underlying data syncing concept is fundamentally broken, and it's not really the kernel's fault.</p> <p>We assume that we can safely:</p> <pre><code>procA: open() procA: write() procA: close() </code></pre> <p>... some long time later, unbounded as far as the kernel is concerned ...</p> <pre><code>procB: open() procB: fsync() procB: close() </code></pre> <p>If the kernel does writeback in the middle, how on earth is it supposed to know we expect to reopen the file and check back later?</p> <p>Should it just remember &quot;this file had an error&quot; forever, and tell every caller? In that case how could we recover? We'd need some new API to say &quot;yeah, ok already, I'm redoing all my work since the last good fsync() so you can clear the error flag now&quot;. 
Otherwise it'd keep reporting an error after we did redo to recover, too.</p> <p>I never really clicked to the fact that we closed relations with pending buffered writes, left them closed, then reopened them to fsync. That's .... well, the kernel isn't the only thing doing crazy things here.</p> <p>Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs, hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint). If it did a checkpoint since the logs, fake up a backup label to force redo to start from the last checkpoint before the error. Otherwise, it's safe to just let it start up again and do redo again.</p> <p>Fun times.</p> <p>This also means AFAICS that running Pg on NFS is extremely unsafe, you MUST make sure you don't run out of disk. Because the usual safeguard of space reservation against ENOSPC in fsync doesn't apply to NFS. (I haven't tested this with nfsv3 in sync,hard,nointr mode yet, <em>maybe</em> that's safe, but I doubt it). The same applies to thin-provisioned storage. Just. Don't.</p> <p>This helps explain various reports of corruption in Docker and various other tools that use various sorts of thin provisioning. If you hit ENOSPC in fsync(), bye bye data.</p> <hr /> <pre><code>From:Peter Geoghegan &lt;pg(at)bowt(dot)ie&gt; Date:2018-04-08 03:37:06 </code></pre> <p>On Sat, Apr 7, 2018 at 8:27 PM, Craig Ringer wrote:</p> <blockquote> <p>More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.</p> <p>We have a storage abstraction that makes this way, way less painful than it should be.</p> <p>We can virtualize relfilenodes into storage extents in relatively few big files. 
We could use sparse regions to make the addressing more convenient, but that makes copying and backup painful, so I'd rather not.</p> <p>Even one file per tablespace for persistent relation heaps, another for indexes, another for each fork type.</p> </blockquote> <p>I'm not sure that we can do that now, since it would break the new &quot;Optimize btree insertions for common case of increasing values&quot; optimization. (I did mention this before it went in.)</p> <p>I've asked Pavan to at least add a note to the nbtree README that explains the high level theory behind the optimization, as part of post-commit clean-up. I'll ask him to say something about how it might affect extent-based storage, too.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-08 03:46:17 </code></pre> <blockquote> <p>On Apr 7, 2018, at 20:27, Craig Ringer wrote:</p> <p>Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs, hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint). If it did a checkpoint since the logs, fake up a backup label to force redo to start from the last checkpoint before the error. Otherwise, it's safe to just let it start up again and do redo again.</p> </blockquote> <p>Before we spiral down into despair and excessive alcohol consumption, this is basically the same situation as a checksum failure or some other kind of uncorrected media-level error. The bad part is that we have to find out from the kernel logs rather than from PostgreSQL directly. 
But this does not strike me as otherwise significantly different from, say, an infrequently-accessed disk block reporting an uncorrectable error when we finally get around to reading it.</p> <hr /> <pre><code>From:Andreas Karlsson &lt;andreas(at)proxel(dot)se&gt; Date:2018-04-08 09:41:06 </code></pre> <p>On 04/08/2018 05:27 AM, Craig Ringer wrote:</p> <blockquote> <p>More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.</p> </blockquote> <p>FYI: MySQL has by default one file per table these days. The old approach with one massive file was a maintenance headache so they changed the default some releases ago.</p> <p><a href="https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html">https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html</a></p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-08 10:30:31 </code></pre> <p>On 8 April 2018 at 11:46, Christophe Pettus wrote:</p> <blockquote> <p>On Apr 7, 2018, at 20:27, Craig Ringer wrote:</p> <p>Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs, hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint). If it did a checkpoint since the logs, fake up a backup label to force redo to start from the last checkpoint before the error. Otherwise, it's safe to just let it start up again and do redo again.</p> <p>Before we spiral down into despair and excessive alcohol consumption, this is basically the same situation as a checksum failure or some other kind of uncorrected media-level error. The bad part is that we have to find out from the kernel logs rather than from PostgreSQL directly. 
But this does not strike me as otherwise significantly different from, say, an infrequently-accessed disk block reporting an uncorrectable error when we finally get around to reading it.</p> </blockquote> <p>I don't entirely agree - because it affects ENOSPC, I/O errors on thin provisioned storage, I/O errors on multipath storage, etc. (I identified the original issue on a thin provisioned system that ran out of backing space, mangling PostgreSQL in a way that made no sense at the time).</p> <p>These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-08 10:31:24 </code></pre> <p>On 8 April 2018 at 17:41, Andreas Karlsson wrote:</p> <blockquote> <p>On 04/08/2018 05:27 AM, Craig Ringer wrote:&gt; More below, but here's an idea #5: decide InnoDB has the right idea, and</p> <blockquote> <p>go to using a single massive blob file, or a few giant blobs.</p> </blockquote> <p>FYI: MySQL has by default one file per table these days. The old approach with one massive file was a maintenance headache so they change the default some releases ago.</p> <p><a href="https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html">https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html</a></p> </blockquote> <p>Huh, thanks for the update.</p> <p>We should see how they handle reliable flushing and see if they've looked into it. 
If they haven't, we should give them a heads-up and if they have, let's learn from them.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-08 16:38:03 </code></pre> <blockquote> <p>On Apr 8, 2018, at 03:30, Craig Ringer wrote:</p> <p>These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.</p> </blockquote> <p>This is definitely bad, and it explains a few otherwise-inexplicable corruption issues we've seen. (And great work tracking it down!) I think it's important not to panic, though; PostgreSQL doesn't have a reputation for horrible data integrity. I'm not sure it makes sense to do a major rearchitecting of the storage layer (especially with pluggable storage coming along) to address this. While the failure modes are more common, the solution (a PITR backup) is one that an installation should have anyway against media failures.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-08 21:23:21 </code></pre> <p>On 8 April 2018 at 04:27, Craig Ringer wrote:</p> <blockquote> <p>On 8 April 2018 at 10:16, Thomas Munro wrote:</p> <p>If the kernel does writeback in the middle, how on earth is it supposed to know we expect to reopen the file and check back later?</p> <p>Should it just remember &quot;this file had an error&quot; forever, and tell every caller? In that case how could we recover? We'd need some new API to say &quot;yeah, ok already, I'm redoing all my work since the last good fsync() so you can clear the error flag now&quot;. Otherwise it'd keep reporting an error after we did redo to recover, too.</p> </blockquote> <p>There is no spoon^H^H^H^H^Herror flag. We don't need fsync to keep track of any errors. We just need fsync to accurately report whether all the buffers in the file have been written out. 
When you call fsync again the kernel needs to initiate i/o on all the dirty buffers and block until they complete successfully. If they complete successfully then nobody cares whether they had some failure in the past when i/o was initiated at some point in the past.</p> <p>The problem is not that errors aren't being tracked correctly. The problem is that dirty buffers are being marked clean when they haven't been written out. They consider dirty filesystem buffers when there's hardware failure preventing them from being written &quot;a memory leak&quot;.</p> <p>As long as any error means the kernel has discarded writes then there's no real hope of any reliable operation through that interface.</p> <p>Going to DIRECTIO is basically recognizing this. That the kernel filesystem buffer provides no reliable interface so we need to reimplement it ourselves in user space.</p> <p>It's rather disheartening. Aside from having to do all that work we have the added barrier that we don't have as much information about the hardware as the kernel has. We don't know where raid stripes begin and end, how big the memory controller buffers are or how to tell when they're full or empty or how to flush them. etc etc. We also don't know what else is going on on the machine.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-08 21:28:43 </code></pre> <blockquote> <p>On Apr 8, 2018, at 14:23, Greg Stark wrote:</p> <p>They consider dirty filesystem buffers when there's hardware failure preventing them from being written &quot;a memory leak&quot;.</p> </blockquote> <p>That's not an irrational position. File system buffers are <em>not</em> dedicated memory for file system caching; they're being used for that because no one has a better use for them at that moment. If an inability to flush them to disk meant that they suddenly became pinned memory, a large copy operation to a yanked USB drive could result in the system having no more allocatable memory. 
I guess in theory that they could swap them, but swapping out a file system buffer in hopes that sometime in the future it could be properly written doesn't seem very architecturally sound to me.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-08 21:47:04 </code></pre> <p>On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 8 April 2018 at 04:27, Craig Ringer wrote:</p> <blockquote> <p>On 8 April 2018 at 10:16, Thomas Munro wrote:</p> <p>If the kernel does writeback in the middle, how on earth is it supposed to know we expect to reopen the file and check back later?</p> <p>Should it just remember &quot;this file had an error&quot; forever, and tell every caller? In that case how could we recover? We'd need some new API to say &quot;yeah, ok already, I'm redoing all my work since the last good fsync() so you can clear the error flag now&quot;. Otherwise it'd keep reporting an error after we did redo to recover, too.</p> </blockquote> <p>There is no spoon^H^H^H^H^Herror flag. We don't need fsync to keep track of any errors. We just need fsync to accurately report whether all the buffers in the file have been written out. When you call fsync</p> </blockquote> <p>Instead, fsync() reports when some of the buffers have not been written out, due to reasons outlined before. As such it may make some sense to maintain some tracking regarding errors even after marking failed dirty pages as clean (in fact it has been proposed, but this introduces memory overhead).</p> <blockquote> <p>again the kernel needs to initiate i/o on all the dirty buffers and block until they complete successfully. 
If they complete successfully then nobody cares whether they had some failure in the past when i/o was initiated at some point in the past.</p> </blockquote> <p>The question is, what should the kernel and application do in cases where this is simply not possible (according to freebsd that keeps dirty pages around after failure, for example, -EIO from the block layer is a contract for unrecoverable errors so it is pointless to keep them dirty). You'd need a specialized interface to clear-out the errors (and drop the dirty pages), or potentially just remount the filesystem.</p> <blockquote> <p>The problem is not that errors aren't been tracked correctly. The problem is that dirty buffers are being marked clean when they haven't been written out. They consider dirty filesystem buffers when there's hardware failure preventing them from being written &quot;a memory leak&quot;.</p> <p>As long as any error means the kernel has discarded writes then there's no real hope of any reliable operation through that interface.</p> </blockquote> <p>This does not necessarily follow. Whether the kernel discards writes or not would not really help (see above). It is more a matter of proper &quot;reporting contract&quot; between userspace and kernel, and tracking would be a way for facilitating this vs. having a more complex userspace scheme (as described by others in this thread) where synchronization for fsync() is required in a multi-process application.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-08 22:29:16 </code></pre> <p>On Sun, Apr 8, 2018 at 09:38:03AM -0700, Christophe Pettus wrote:</p> <blockquote> <blockquote> <p>On Apr 8, 2018, at 03:30, Craig Ringer wrote:</p> <p>These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.</p> </blockquote> <p>This is definitely bad, and it explains a few otherwise-inexplicable corruption issues we've seen. 
(And great work tracking it down!) I think it's important not to panic, though; PostgreSQL doesn't have a reputation for horrible data integrity. I'm not sure it makes sense to do a major rearchitecting of the storage layer (especially with pluggable storage coming along) to address this. While the failure modes are more common, the solution (a PITR backup) is one that an installation should have anyway against media failures.</p> </blockquote> <p>I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost. If we could stop Postgres when such errors happen, at least the administrator could fix the problem or fail over to a standby.</p> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it sees something wrong.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-08 23:10:24 </code></pre> <blockquote> <p>On Apr 8, 2018, at 15:29, Bruce Momjian wrote: I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost.</p> </blockquote> <p>Yeah, it's bad. In the short term, the best advice to installations is to monitor their kernel logs for errors (which very few do right now), and make sure they have a backup strategy which can encompass restoring from an error like this. Even Craig's smart fix of patching the backup label to recover from a previous checkpoint doesn't do much good if we don't have WAL records back that far (or one of the required WAL records also took a hit).</p> <p>In the longer term... 
O_DIRECT seems like the most plausible way out of this, but that might not be popular with people running on file systems or OSes that don't have this issue. (Setting aside the daunting prospect of implementing that.)</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-08 23:16:25 </code></pre> <p>On 2018-04-08 18:29:16 -0400, Bruce Momjian wrote:</p> <blockquote> <p>On Sun, Apr 8, 2018 at 09:38:03AM -0700, Christophe Pettus wrote:</p> <blockquote> <blockquote> <p>On Apr 8, 2018, at 03:30, Craig Ringer wrote:</p> <p>These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.</p> </blockquote> <p>This is definitely bad, and it explains a few otherwise-inexplicable corruption issues we've seen. (And great work tracking it down!) I think it's important not to panic, though; PostgreSQL doesn't have a reputation for horrible data integrity. I'm not sure it makes sense to do a major rearchitecting of the storage layer (especially with pluggable storage coming along) to address this. While the failure modes are more common, the solution (a PITR backup) is one that an installation should have anyway against media failures.</p> </blockquote> <p>I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost. If we could stop Postgres when such errors happen, at least the administrator could fix the problem or fail over to a standby.</p> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it sees something wrong.</p> </blockquote> <p>I think the danger presented here is far smaller than some of the statements in this thread might make one think. 
In all likelihood, once you've got an IO error that kernel level retries don't fix, your database is screwed. Whether fsync reports that or not is really somewhat besides the point. We don't panic that way when getting IO errors during reads either, and they're more likely to be persistent than errors during writes (because remapping on storage layer can fix issues, but not during reads).</p> <p>There's a lot of not so great things here, but I don't think there's any need to panic.</p> <p>We should fix things so that reported errors are treated with crash recovery, and for the rest I think there's very fair arguments to be made that that's far outside postgres's remit.</p> <p>I think there's pretty good reasons to go to direct IO where supported, but error handling doesn't strike me as a particularly good reason for the move.</p> <hr /> <pre><code>From:Christophe Pettus &lt;xof(at)thebuild(dot)com&gt; Date:2018-04-08 23:27:57 </code></pre> <blockquote> <p>On Apr 8, 2018, at 16:16, Andres Freund wrote: We don't panic that way when getting IO errors during reads either, and they're more likely to be persistent than errors during writes (because remapping on storage layer can fix issues, but not during reads).</p> </blockquote> <p>There is a distinction to be drawn there, though, because we immediately pass an error back to the client on a read, but a write problem in this situation can be masked for an extended period of time.</p> <p>That being said...</p> <blockquote> <p>There's a lot of not so great things here, but I don't think there's any need to panic.</p> </blockquote> <p>No reason to panic, yes. We can assume that if this was a very big persistent problem, it would be much more widely reported. 
It would, however, be good to find a way to get the error surfaced back up to the client in a way that is not just monitoring the kernel logs.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-09 01:31:56 </code></pre> <p>On 9 April 2018 at 05:28, Christophe Pettus wrote:</p> <blockquote> <blockquote> <p>On Apr 8, 2018, at 14:23, Greg Stark wrote:</p> <p>They consider dirty filesystem buffers when there's hardware failure preventing them from being written &quot;a memory leak&quot;.</p> </blockquote> <p>That's not an irrational position. File system buffers are <em>not</em> dedicated memory for file system caching; they're being used for that because no one has a better use for them at that moment. If an inability to flush them to disk meant that they suddenly became pinned memory, a large copy operation to a yanked USB drive could result in the system having no more allocatable memory. I guess in theory that they could swap them, but swapping out a file system buffer in hopes that sometime in the future it could be properly written doesn't seem very architecturally sound to me.</p> </blockquote> <p>Yep.</p> <p>Another example is a write to an NFS or iSCSI volume that goes away forever. What if the app keeps write()ing in the hopes it'll come back, and by the time the kernel starts reporting EIO for write(), it's already saddled with a huge volume of dirty writeback buffers it can't get rid of because someone, one day, might want to know about them?</p> <p>You could make the argument that it's OK to forget if the entire file system goes away. But actually, why is that ok? What if it's remounted again? That'd be really bad too, for someone expecting write reliability.</p> <p>You can coarsen from dirty buffer tracking to marking the FD(s) bad, but what if there's no FD to mark because the file isn't open at the moment?</p> <p>You can mark the inode cache entry and pin it, I guess. 
But what if your app triggered I/O errors over vast numbers of small files? Again, the kernel's left holding the ball.</p> <p>It doesn't know if/when an app will return to check. It doesn't know how long to remember the failure for. It doesn't know when all interested clients have been informed and it can treat the fault as cleared/repaired, either, so it'd have to <em>keep on reporting EIO for PostgreSQL's own writes and fsyncs() indefinitely</em>, even once we do recovery.</p> <p>The only way it could avoid that would be to keep the dirty writeback pages around and flagged bad, then clear the flag when a new write() replaces the same file range. I can't imagine that being practical.</p> <p>Blaming the kernel for this sure is the easy way out.</p> <p>But IMO we cannot rationally expect the kernel to remember error state forever for us, then forget it when we expect, all without actually telling it anything about our activities or even that we still exist and are still interested in the files/writes. 
We've closed the files and gone away.</p> <p>Whatever we do, it's likely going to have to involve not doing that anymore.</p> <p>Even if we can somehow convince the kernel folks to add a new interface for us that reports I/O errors to some listener, like an inotify/fnotify/dnotify/whatever-it-is-today-notify extension reporting errors in buffered async writes, we won't be able to rely on having it for 5-10 years, and only on Linux.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-09 01:35:06 </code></pre> <p>On 9 April 2018 at 06:29, Bruce Momjian wrote:</p> <blockquote> <p>I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost.</p> </blockquote> <p>Right.</p> <p>Specifically, we need a way to ask the kernel at checkpoint time &quot;was everything written to [this set of files] flushed successfully since the last time I asked, no matter who did the writing and no matter how the writes were flushed?&quot;</p> <p>If the result is &quot;no&quot; we PANIC and redo. 
If the hardware/volume is screwed, the user can fail over to a standby, do PITR, etc.</p> <p>But we don't have any way to ask that reliably at present.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 01:55:10 </code></pre> <p>Hi,</p> <p>On 2018-04-08 16:27:57 -0700, Christophe Pettus wrote:</p> <blockquote> <blockquote> <p>On Apr 8, 2018, at 16:16, Andres Freund wrote: We don't panic that way when getting IO errors during reads either, and they're more likely to be persistent than errors during writes (because remapping on storage layer can fix issues, but not during reads).</p> </blockquote> <p>There is a distinction to be drawn there, though, because we immediately pass an error back to the client on a read, but a write problem in this situation can be masked for an extended period of time.</p> </blockquote> <p>Only if you're &quot;lucky&quot; enough that your clients actually read that data, and then you're somehow able to figure out across the whole stack that these 0.001% of transactions that fail are due to IO errors. Or you also need to do log analysis.</p> <p>If you want to solve things like that you need regular reads of all your data, including verifications etc.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-09 02:00:41 </code></pre> <p>On 9 April 2018 at 07:16, Andres Freund wrote:</p> <blockquote> <p>I think the danger presented here is far smaller than some of the statements in this thread might make one think.</p> </blockquote> <p>Clearly it's not happening a huge amount or we'd have a lot of noise about Pg eating people's data, people shouting about how unreliable it is, etc. We don't. So it's not some earth shattering imminent threat to everyone's data. 
It's gone unnoticed, or the root cause unidentified, for a long time.</p> <p>I suspect we've written off a fair few issues in the past as &quot;it's bad hardware&quot; when actually, the hardware fault was the trigger for a Pg/kernel interaction bug. And blamed containers for things that weren't really the container's fault. But even so, if it were happening tons, we'd hear more noise.</p> <p>I was already very surprised when I learned that PostgreSQL completely ignores wholly absent relfilenodes. Specifically, if you unlink() a relation's backing relfilenode while Pg is down and that file has writes pending in the WAL, we merrily re-create it with uninitialized pages and go on our way. As Andres pointed out in an offlist discussion, redo isn't a consistency check, and it's not obliged to fail in such cases. We can say &quot;well, don't do that then&quot; and define away file losses from FS corruption etc as not our problem, the lower levels we expect to take care of this have failed.</p> <p>We have to look at what checkpoints are and are not supposed to promise, and whether this is a problem we just define away as &quot;not our problem, the lower level failed, we're not obliged to detect this and fail gracefully.&quot;</p> <p>We can choose to say that checkpoints are required to guarantee crash/power loss safety ONLY and do not attempt to protect against I/O errors of any sort. In fact, I think we should likely amend the documentation for release versions to say just that.</p> <blockquote> <p>In all likelihood, once you've got an IO error that kernel level retries don't fix, your database is screwed.</p> </blockquote> <p>Your database is going to be down or have interrupted service. It's possible you may have some unreadable data. This could result in localised damage to one or more relations. That could affect FK relationships, indexes, all sorts. 
If you're really unlucky you might lose something critical like pg_clog/ contents.</p> <p>But in general your DB should be repairable/recoverable even in those cases.</p> <p>And in many failure modes there's no reason to expect any data loss at all, like:</p> <ul> <li>Local disk fills up (seems to be safe already due to space reservation at write() time)</li> <li>Thin-provisioned storage backing local volume iSCSI or paravirt block device fills up</li> <li>NFS volume fills up</li> <li>Multipath I/O error</li> <li>Interruption of connectivity to network block device</li> <li>Disk develops localized bad sector where we haven't previously written data</li> </ul> <p>Except for the ENOSPC on NFS, all the rest of the cases can be handled by expecting the kernel to retry forever and not return until the block is written or we reach the heat death of the universe. And NFS, well...</p> <p>Part of the trouble is that the kernel <em>won't</em> retry forever in all these cases, and doesn't seem to have a way to ask it to in all cases.</p> <p>And if the user hasn't configured it for the right behaviour in terms of I/O error resilience, we don't find out about it.</p> <p>So it's not the end of the world, but it'd sure be nice to fix.</p> <blockquote> <p>Whether fsync reports that or not is really somewhat besides the point. We don't panic that way when getting IO errors during reads either, and they're more likely to be persistent than errors during writes (because remapping on storage layer can fix issues, but not during reads).</p> </blockquote> <p>That's because reads don't make promises about what's committed and synced. I think that's quite different.</p> <blockquote> <p>We should fix things so that reported errors are treated with crash recovery, and for the rest I think there's very fair arguments to be made that that's far outside postgres's remit.</p> </blockquote> <p>Certainly for current versions.</p> <p>I think we need to think about a more robust path in future. 
But it's certainly not &quot;stop the world&quot; territory.</p> <p>The docs need an update to indicate that we explicitly disclaim responsibility for I/O errors on async writes, and that the kernel and I/O stack must be configured never to give up on buffered writes. If it does, that's not our problem anymore.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 02:06:12 </code></pre> <p>On 2018-04-09 10:00:41 +0800, Craig Ringer wrote:</p> <blockquote> <p>I suspect we've written off a fair few issues in the past as &quot;it'd bad hardware&quot; when actually, the hardware fault was the trigger for a Pg/kernel interaction bug. And blamed containers for things that weren't really the container's fault. But even so, if it were happening tons, we'd hear more noise.</p> </blockquote> <p>Agreed on that, but I think that's FAR more likely to be things like multixacts, index structure corruption due to logic bugs etc.</p> <blockquote> <p>I've already been very surprised there when I learned that PostgreSQL completely ignores wholly absent relfilenodes. Specifically, if you unlink() a relation's backing relfilenode while Pg is down and that file has writes pending in the WAL. We merrily re-create it with uninitalized pages and go on our way. As Andres pointed out in an offlist discussion, redo isn't a consistency check, and it's not obliged to fail in such cases. 
We can say &quot;well, don't do that then&quot; and define away file losses from FS corruption etc as not our problem, the lower levels we expect to take care of this have failed.</p> </blockquote> <p>And it'd be a really bad idea to behave differently.</p> <blockquote> <p>And in many failure modes there's no reason to expect any data loss at all, like:</p> <ul> <li>Local disk fills up (seems to be safe already due to space reservation at write() time)</li> </ul> </blockquote> <p>That definitely should be treated separately.</p> <blockquote> <ul> <li>Thin-provisioned storage backing local volume iSCSI or paravirt block device fills up</li> <li>NFS volume fills up</li> </ul> </blockquote> <p>Those should be the same as the above.</p> <blockquote> <p>I think we need to think about a more robust path in future. But it's certainly not &quot;stop the world&quot; territory.</p> </blockquote> <p>I think you're underestimating the complexity of doing that by at least two orders of magnitude.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-09 03:15:01 </code></pre> <p>On 9 April 2018 at 10:06, Andres Freund wrote:</p> <blockquote> <blockquote> <p>And in many failure modes there's no reason to expect any data loss at all, like:</p> <ul> <li>Local disk fills up (seems to be safe already due to space reservation at write() time)</li> </ul> </blockquote> <p>That definitely should be treated separately.</p> </blockquote> <p>It is, because all the FSes I looked at reserve space before returning from write(), even if they do delayed allocation. So they won't fail with ENOSPC at fsync() time or silently due to lost errors on background writeback. 
Otherwise we'd be hearing a LOT more noise about this.</p> <blockquote> <blockquote> <ul> <li>Thin-provisioned storage backing local volume iSCSI or paravirt block device fills up</li> <li>NFS volume fills up</li> </ul> </blockquote> <p>Those should be the same as the above.</p> </blockquote> <p>Unfortunately, they aren't.</p> <p>AFAICS NFS doesn't reserve space with the other end before returning from write(), even if mounted with the sync option. So we can get ENOSPC lazily when the buffer writeback fails due to a full backing file system. This then travels the same paths as EIO: we fsync(), ERROR, retry, appear to succeed, and carry on with life losing the data. Or we never hear about the error in the first place.</p> <p>(There's a proposed extension that'd allow this, see <a href="https://tools.ietf.org/html/draft-iyer-nfsv4-space-reservation-ops-02#page-5">https://tools.ietf.org/html/draft-iyer-nfsv4-space-reservation-ops-02#page-5</a>, but I see no mention of it in fs/nfs. All the reserve_space / xdr_reserve_space stuff seems to be related to space in protocol messages at a quick read.)</p> <p>Thin provisioned storage could vary a fair bit depending on the implementation. But the specific failure case I saw, prompting this thread, was on a volume using the stack:</p> <pre><code>xfs -&gt; lvm2 -&gt; multipath -&gt; ??? -&gt; SAN </code></pre> <p>(the HBA/iSCSI/whatever was not recorded by the looks, but IIRC it was iSCSI. I'm checking.)</p> <p>The SAN ran out of space. Due to use of thin provisioning, Linux <em>thought</em> there was plenty of space on the volume; LVM thought it had plenty of physical extents free and unallocated, XFS thought there was tons of free space, etc. 
The space exhaustion manifested as I/O errors on flushes of writeback buffers.</p> <p>The logs were like this:</p> <pre><code>kernel: sd 2:0:0:1: [sdd] Unhandled sense code
kernel: sd 2:0:0:1: [sdd]
kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
kernel: sd 2:0:0:1: [sdd]
kernel: Sense Key : Data Protect [current]
kernel: sd 2:0:0:1: [sdd]
kernel: Add. Sense: Space allocation failed write protect
kernel: sd 2:0:0:1: [sdd] CDB:
kernel: Write(16): **HEX-DATA-CUT-OUT**
kernel: Buffer I/O error on device dm-0, logical block 3098338786
kernel: lost page write due to I/O error on dm-0
kernel: Buffer I/O error on device dm-0, logical block 3098338787
</code></pre> <p>The immediate cause was that Linux's multipath driver didn't seem to recognise the sense code as retryable, so it gave up and reported it to the next layer up (LVM). LVM and XFS both seem to think that the lower layer is responsible for retries, so they toss the write away, and tell any interested writers if they feel like it, per discussion upthread.</p> <p>In this case Pg did get the news and reported fsync() errors on checkpoints, but it only reported an error once per relfilenode. Once it ran out of failed relfilenodes to cause the checkpoint to ERROR, it &quot;completed&quot; a &quot;successful&quot; checkpoint and kept on running until the resulting corruption started to manifest itself and it segfaulted some time later. 
As we've now learned, there's no guarantee we'd even get the news about the I/O errors at all.</p> <p>WAL was on a separate volume that didn't run out of room immediately, so we didn't PANIC on WAL write failure and prevent the issue.</p> <p>In this case if Pg had PANIC'd (and been able to guarantee to get the news of write failures reliably), there'd have been no corruption and no data loss despite the underlying storage issue.</p> <p>If, prior to seeing this, you'd asked me &quot;will my PostgreSQL database be corrupted if my thin-provisioned volume runs out of space&quot; I'd have said &quot;Surely not. PostgreSQL won't be corrupted by running out of disk space, it orders writes carefully and forces flushes so that it will recover gracefully from write failures.&quot;</p> <p>Except not. I was very surprised.</p> <p>BTW, it also turns out that the <em>default</em> for multipath is to give up on errors anyway; see the queue_if_no_path option and no_path_retries options. (Hint: run PostgreSQL with no_path_retries=queue). That's a sane default if you use O_DIRECT|O_SYNC, and otherwise pretty much a data-eating setup.</p> <p>I regularly see rather a lot of multipath systems, iSCSI systems, SAN backed systems, etc. I think we need to be pretty clear that we expect them to retry indefinitely, and if they report an I/O error we cannot reliably handle it. We need to patch Pg to PANIC on any fsync() failure and document that Pg won't notice some storage failure modes that might otherwise be considered nonfatal or transient, so very specific storage configuration and testing is required. (Not that anyone will do it). Also warn against running on NFS even with &quot;hard,sync,nointr&quot;.</p> <p>It'd be interesting to have a tool that tested error handling, allowing people to do iSCSI plug-pull tests, that sort of thing. 
But as far as I can tell nobody ever tests their storage stack anyway, so I don't plan on writing something that'll never get used.</p> <blockquote> <blockquote> <p>I think we need to think about a more robust path in future. But it's certainly not &quot;stop the world&quot; territory.</p> </blockquote> <p>I think you're underestimating the complexity of doing that by at least two orders of magnitude.</p> </blockquote> <p>Oh, it's just a minor total rewrite of half Pg, no big deal ;)</p> <p>I'm sure that no matter how big I think it is, I'm still underestimating it.</p> <p>The most workable option IMO would be some sort of fnotify/dnotify/whatever that reports all I/O errors on a volume. Some kind of error reporting handle we can keep open on a volume level that we can check for each volume/tablespace after we fsync() everything to see if it all really worked. If we PANIC if that gives us a bad answer, and PANIC on fsync errors, we guard against the great majority of these sorts of should-be-transient-if-the-kernel-didn't-give-up-and-throw-away-our-data errors.</p> <p>Even then, good luck getting those events from an NFS volume in which the backing volume experiences an issue.</p> <p>And it's kind of moot because AFAICS no such interface exists.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-09 08:45:40 </code></pre> <p>On 8 April 2018 at 22:47, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 8 April 2018 at 04:27, Craig Ringer wrote:</p> <blockquote> <p>On 8 April 2018 at 10:16, Thomas Munro </p> </blockquote> </blockquote> <p>The question is, what should the kernel and application do in cases where this is simply not possible (according to freebsd that keeps dirty pages around after failure, for example, -EIO from the block layer is a contract for unrecoverable errors so it is pointless to keep them dirty). 
You'd need a specialized interface to clear-out the errors (and drop the dirty pages), or potentially just remount the filesystem.</p> </blockquote> <p>Well firstly that's not necessarily the question. ENOSPC is not an unrecoverable error. And even unrecoverable errors for a single write doesn't mean the write will never be able to succeed in the future. But secondly doesn't such an interface already exist? When the device is dropped any dirty pages already get dropped with it. What's the point in dropping them but keeping the failing device?</p> <p>But just to underline the point. &quot;pointless to keep them dirty&quot; is exactly backwards from the application's point of view. If the error writing to persistent media really is unrecoverable then it's all the more critical that the pages be kept so the data can be copied to some other device. The last thing user space expects to happen is if the data can't be written to persistent storage then also immediately delete it from RAM. (And the <em>really</em> last thing user space expects is for this to happen and return no error.)</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 10:50:41 </code></pre> <p>On Mon, Apr 09, 2018 at 09:45:40AM +0100, Greg Stark wrote:</p> <blockquote> <p>On 8 April 2018 at 22:47, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 8 April 2018 at 04:27, Craig Ringer wrote:</p> <blockquote> <p>On 8 April 2018 at 10:16, Thomas Munro </p> </blockquote> </blockquote> <p>The question is, what should the kernel and application do in cases where this is simply not possible (according to freebsd that keeps dirty pages around after failure, for example, -EIO from the block layer is a contract for unrecoverable errors so it is pointless to keep them dirty). 
You'd need a specialized interface to clear-out the errors (and drop the dirty pages), or potentially just remount the filesystem.</p> </blockquote> <p>Well firstly that's not necessarily the question. ENOSPC is not an unrecoverable error. And even unrecoverable errors for a single write doesn't mean the write will never be able to succeed in the future.</p> </blockquote> <p>To make things a bit simpler, let us focus on EIO for the moment. The contract between the block layer and the filesystem layer is assumed to be that of, when an EIO is propagated up to the fs, then you may assume that all possibilities for recovering have been exhausted in lower layers of the stack. Mind you, I am not claiming that this contract is either documented or necessarily respected (in fact there have been studies on the error propagation and handling of the block layer, see [1]). Let us assume that this is the design contract though (which appears to be the case across a number of open-source kernels), and if not - it's a bug. In this case, indeed the specific write()s will never be able to succeed in the future, at least not as long as the BIOs are allocated to the specific failing LBAs.</p> <blockquote> <p>But secondly doesn't such an interface already exist? When the device is dropped any dirty pages already get dropped with it. What's the point in dropping them but keeping the failing device?</p> </blockquote> <p>I think there are degrees of failure. There are certainly cases where one may encounter localized unrecoverable medium errors (specific to certain LBAs) that are non-maskable from the block layer and below. That does not mean that the device is dropped at all, so it does make sense to continue all other operations to all other regions of the device that are functional. In cases of total device failure, then the filesystem will prevent you from proceeding anyway.</p> <blockquote> <p>But just to underline the point. 
&quot;pointless to keep them dirty&quot; is exactly backwards from the application's point of view. If the error writing to persistent media really is unrecoverable then it's all the more critical that the pages be kept so the data can be copied to some other device. The last thing user space expects to happen is if the data can't be written to persistent storage then also immediately delete it from RAM. (And the <em>really</em> last thing user space expects is for this to happen and return no error.)</p> </blockquote> <p>Right. This implies, though, that apart from the kernel having to keep the dirtied-but-unrecoverable pages around for an unbounded time, there would also need to be an interface for obtaining the exact failed pages so that you can read them back. This in turn means that there needs to be an association between the fsync() caller and the specific dirtied pages that the caller intends to drain (for which we'd need an fsync_range(), among other things). BTW, currently the failed writebacks are not dropped from memory, but rather marked clean. They could be lost though due to memory pressure or due to explicit request (e.g. proc drop_caches), unless mlocked.</p> <p>There is a clear responsibility of the application to keep its buffers around until a successful fsync(). The kernels do report the error (albeit with all the complexities of dealing with the interface), at which point the application may not assume that the write()s were ever even buffered in the kernel page cache in the first place.</p> <p>What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and indemnifying the application from any further responsibility, i.e. 
a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.</p> <p>[1] <a href="https://www.usenix.org/legacy/event/fast08/tech/full_papers/gunawi/gunawi.pdf">https://www.usenix.org/legacy/event/fast08/tech/full_papers/gunawi/gunawi.pdf</a></p> <hr /> <pre><code>From:Geoff Winkless &lt;pgsqladmin(at)geoff(dot)dj&gt; Date:2018-04-09 12:03:28 </code></pre> <p>On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:</p> <blockquote> <p>What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and indemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.</p> </blockquote> <p>That seems like a perfectly reasonable position to take, frankly.</p> <p>The whole <em>point</em> of an Operating System should be that you can do exactly that. As a developer I should be able to call write() and fsync() and know that if both calls have succeeded then the result is on disk, no matter what another application has done in the meantime. If that's a &quot;difficult&quot; problem then that's the OS's problem, not mine. If the OS doesn't do that, it's <em>not doing its job</em>.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-09 12:16:38 </code></pre> <p>On 9 April 2018 at 18:50, Anthony Iliopoulos wrote:</p> <blockquote> <p>There is a clear responsibility of the application to keep its buffers around until a successful fsync(). 
The kernels do report the error (albeit with all the complexities of dealing with the interface), at which point the application may not assume that the write()s where ever even buffered in the kernel page cache in the first place.</p> <p>What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and idemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.</p> </blockquote> <p>That's what Pg appears to assume now, yes.</p> <p>Whether that's reasonable is a whole different topic.</p> <p>I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. &quot;Tell us about anything that happens under pgdata/&quot; or an inotify-style per-directory-registration option. I'd even say that's ideal.</p> <p>In the mean time, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have many better options. At least it'll only flush what we actually wrote to the OS buffers not what we may have in shared_buffers. If the bgwriter does the same thing, we should be 100% safe from this problem on 4.13+, and it'd be trivial to make it a GUC much like the fsync or full_page_writes options that people can turn off if they know the risks / know their storage is safe / don't care.</p> <p>Some keen person who wants to later could optimise it by adding a fsync worker thread pool in backends, so we don't block the main thread. Frankly that might be a nice thing to have in the checkpointer anyway. But it's out of scope for fixing this in durability terms.</p> <p>I'm partway through a patch that makes fsync panic on errors now. 
Once that's done, the next step will be to force fsync on close() in md and see how we go with that.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 12:31:27 </code></pre> <p>On Mon, Apr 09, 2018 at 01:03:28PM +0100, Geoff Winkless wrote:</p> <blockquote> <p>On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:</p> <blockquote> <p>What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and idemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.</p> </blockquote> <p>That seems like a perfectly reasonable position to take, frankly.</p> </blockquote> <p>Indeed, as long as you are willing to ignore the consequences of this design decision: mainly, how you would recover memory when no application is interested in clearing the error. At which point other applications with different priorities will find this position rather unreasonable since there can be no way out of it for them. Good luck convincing any OS kernel upstream to go with this design.</p> <blockquote> <p>The whole <em>point</em> of an Operating System should be that you can do exactly that. As a developer I should be able to call write() and fsync() and know that if both calls have succeeded then the result is on disk, no matter what another application has done in the meantime. If that's a &quot;difficult&quot; problem then that's the OS's problem, not mine. If the OS doesn't do that, it's _not_doing_its<em>job</em>.</p> </blockquote> <p>No OS kernel that I know of provides any promises for atomicity of a write()+fsync() sequence, unless one is using O_SYNC. 
It doesn't provide you with isolation either, as this is delegated to userspace, where processes that share a file should coordinate accordingly.</p> <p>It's not a difficult problem, but rather the kernels provide a common denominator of possible interfaces and designs that could accommodate a wider range of potential application scenarios for which the kernel cannot possibly anticipate requirements. There have been plenty of experimental works for providing a transactional (ACID) filesystem interface to applications. On the opposite end, there have been quite a few commercial databases that completely bypass the kernel storage stack. But I would assume it is reasonable to figure out something between those two extremes that can work in a &quot;portable&quot; fashion.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 12:54:16 </code></pre> <p>On Mon, Apr 09, 2018 at 08:16:38PM +0800, Craig Ringer wrote:</p> <blockquote> <p>I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. &quot;Tell us about anything that happens under pgdata/&quot; or an inotify-style per-directory-registration option. I'd even say that's ideal.</p> </blockquote> <p>I see what you are saying. So basically you'd always maintain the notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to). 
The kernel wouldn't even have to maintain per-page bits to trace the errors, since they will be consumed by the process that reads the events (or discarded, when the notification fd is closed).</p> <p>Assuming this would be possible, wouldn't Pg still need to deal with synchronizing writers and related issues (since this would be merely a notification mechanism - not prevent any process from continuing), which I understand would be rather intrusive for the current Pg multi-process design?</p> <p>But other than that, this interface could in principle be similarly implemented in the BSDs via kqueue(), I suppose, to provide what you need.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 13:33:18 </code></pre> <p>On 04/09/2018 02:31 PM, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 01:03:28PM +0100, Geoff Winkless wrote:</p> <blockquote> <p>On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:</p> <blockquote> <p>What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and indemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.</p> </blockquote> <p>That seems like a perfectly reasonable position to take, frankly.</p> </blockquote> <p>Indeed, as long as you are willing to ignore the consequences of this design decision: mainly, how you would recover memory when no application is interested in clearing the error. At which point other applications with different priorities will find this position rather unreasonable since there can be no way out of it for them.</p> </blockquote> <p>Sure, but the question is whether the system can reasonably operate after some of the writes failed and the data got lost. 
Because if it can't, then recovering the memory is rather useless. It might be better to stop the system in that case, forcing the system administrator to resolve the issue somehow (fail-over to a replica, perform recovery from the last checkpoint, ...).</p> <p>We already have dirty_bytes and dirty_background_bytes, for example. I don't see why there couldn't be another limit defining how much dirty data to allow before blocking writes altogether. I'm sure it's not that simple, but you get the general idea - do not allow using all available memory because of writeback issues, but don't throw the data away in case it's just a temporary issue.</p> <blockquote> <p>Good luck convincing any OS kernel upstream to go with this design.</p> </blockquote> <p>Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.</p> <p>The question is whether the current design makes it any easier for user-space developers to build reliable systems. We have tried using it, and unfortunately the answer seems to be &quot;no&quot; and &quot;Use direct I/O and manage everything on your own!&quot;</p> <blockquote> <blockquote> <p>The whole <em>point</em> of an Operating System should be that you can do exactly that. As a developer I should be able to call write() and fsync() and know that if both calls have succeeded then the result is on disk, no matter what another application has done in the meantime. If that's a &quot;difficult&quot; problem then that's the OS's problem, not mine. If the OS doesn't do that, it's _not_doing_its_job_.</p> </blockquote> <p>No OS kernel that I know of provides any promises for atomicity of a write()+fsync() sequence, unless one is using O_SYNC. 
It doesn't provide you with isolation either, as this is delegated to userspace, where processes that share a file should coordinate accordingly.</p> </blockquote> <p>We can (and do) take care of the atomicity and isolation. Implementation of those parts is obviously very application-specific, and we have WAL and locks for that purpose. I/O on the other hand seems to be a generic service provided by the OS - at least that's how we saw it until now.</p> <blockquote> <p>It's not a difficult problem, but rather the kernels provide a common denominator of possible interfaces and designs that could accommodate a wider range of potential application scenarios for which the kernel cannot possibly anticipate requirements. There have been plenty of experimental works for providing a transactional (ACID) filesystem interface to applications. On the opposite end, there have been quite a few commercial databases that completely bypass the kernel storage stack. But I would assume it is reasonable to figure out something between those two extremes that can work in a &quot;portable&quot; fashion.</p> </blockquote> <p>Users ask us about this quite often, actually. The question is usually about &quot;RAW devices&quot; and performance, but ultimately it boils down to buffered vs. direct I/O. 
So far our answer has been that we rely on the kernel to do this reliably, because they know how to do that correctly and we simply don't have the manpower to implement it (portable, reliable, handling different types of storage, ...).</p> <p>One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 13:42:35 </code></pre> <p>On 04/09/2018 12:29 AM, Bruce Momjian wrote:</p> <blockquote> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it sees something wrong.</p> </blockquote> <p>That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how that would work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. 
And we probably don't want to PANIC after an I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.</p> <hr /> <pre><code>From:Abhijit Menon-Sen &lt;ams(at)2ndQuadrant(dot)com&gt; Date:2018-04-09 13:47:03 </code></pre> <p>At 2018-04-09 15:42:35 +0200, tomas(dot)vondra(at)2ndquadrant(dot)com wrote:</p> <blockquote> <p>On 04/09/2018 12:29 AM, Bruce Momjian wrote:</p> <blockquote> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it sees something wrong.</p> </blockquote> <p>That doesn't seem like a very practical way.</p> </blockquote> <p>Not least because Craig's tests showed that you can't rely on <em>always</em> getting an error message in the logs.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 13:54:19 </code></pre> <p>On 04/09/2018 04:00 AM, Craig Ringer wrote:</p> <blockquote> <p>On 9 April 2018 at 07:16, Andres Freund &lt;andres(at)anarazel(dot)de</p> <blockquote> <p>I think the danger presented here is far smaller than some of the statements in this thread might make one think.</p> </blockquote> <p>Clearly it's not happening a huge amount or we'd have a lot of noise about Pg eating people's data, people shouting about how unreliable it is, etc. We don't. So it's not some earth shattering imminent threat to everyone's data. It's gone unnoticed, or the root cause unidentified, for a long time.</p> </blockquote> <p>Yeah, it clearly isn't the case that everything we do suddenly got pointless. It's fairly annoying, though.</p> <blockquote> <p>I suspect we've written off a fair few issues in the past as &quot;it's bad hardware&quot; when actually, the hardware fault was the trigger for a Pg/kernel interaction bug. And blamed containers for things that weren't really the container's fault. But even so, if it were happening tons, we'd hear more noise.</p> </blockquote> <p>Right. 
Write errors are fairly rare, and we've probably ignored a fair number of cases demonstrating this issue. It kinda reminds me of the wisdom that not seeing planes with bullet holes in the engine does not mean engines don't need armor [1].</p> <p>[1] <a href="https://medium.com/@penguinpress/an-excerpt-from-how-not-to-be-wrong-by-jordan-ellenberg-664e708cfc3d">https://medium.com/@penguinpress/an-excerpt-from-how-not-to-be-wrong-by-jordan-ellenberg-664e708cfc3d</a></p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 14:22:06 </code></pre> <p>On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>We already have dirty_bytes and dirty_background_bytes, for example. I don't see why there couldn't be another limit defining how much dirty data to allow before blocking writes altogether. I'm sure it's not that simple, but you get the general idea - do not allow using all available memory because of writeback issues, but don't throw the data away in case it's just a temporary issue.</p> </blockquote> <p>Sure, there could be knobs for limiting how much memory such &quot;zombie&quot; pages may occupy. Not sure how helpful it would be in the long run since this tends to be highly application-specific, and for something with a large data footprint one would end up tuning this accordingly in a system-wide manner. This has the potential to leave other applications running in the same system with very little memory, in cases where, for example, the original application crashes and never clears the error. Apart from that, further interfaces would need to be provided for actually dealing with the error (again assuming non-transient issues that may not be fixed transparently and that temporary issues are taken care of by lower layers of the stack).</p> <blockquote> <p>Well, there seem to be kernels that seem to do exactly that already. 
At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.</p> </blockquote> <p>It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.</p> <p>Ideally, you'd want a (potentially persistent) indication of error localized to a file region (mapping the corresponding failed writeback pages). NetBSD is already implementing fsync_ranges(), which could be a step in the right direction.</p> <blockquote> <p>One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.</p> </blockquote> <p>I would expect it would be very few, potentially those that have a very simple process model (e.g. embedded DBs that can abort a txn on fsync() EIO). I think that durability is a rather complex cross-layer issue which has been grossly misunderstood similarly in the past (e.g. see [1]). 
It seems that both the OS and DB communities greatly benefit from a periodic reality check, and I see this as an opportunity for strengthening the IO stack in an end-to-end manner.</p> <p>[1] <a href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf">https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf</a></p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-09 15:29:36 </code></pre> <p>On 9 April 2018 at 15:22, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:</p> <p>Sure, there could be knobs for limiting how much memory such &quot;zombie&quot; pages may occupy. Not sure how helpful it would be in the long run since this tends to be highly application-specific, and for something with a large data footprint one would end up tuning this accordingly in a system-wide manner.</p> </blockquote> <p>Surely this is exactly what the kernel is there to manage. It has to control how much memory is allowed to be full of dirty buffers in the first place to ensure that the system won't get memory starved if it can't clean them fast enough. That isn't even about persistent hardware errors. Even when the hardware is working perfectly it can only flush buffers so fast. The whole point of the kernel is to abstract away shared resources. It's not like user space has any better view of the situation here. If Postgres implemented all this in DIRECT_IO it would have exactly the same problem only with less visibility into what the rest of the system is doing. 
If every application implemented its own buffer cache we would be back in the same boat only with a fragmented memory allocation.</p> <blockquote> <p>This has the potential to leave other applications running in the same system with very little memory, in cases where for example original application crashes and never clears the error.</p> </blockquote> <p>I still think we're speaking two different languages. There's no application anywhere that's going to &quot;clear the error&quot;. The application has done the writes and if it's calling fsync it wants to wait until the filesystem can arrange for the write to be persisted. If the application could manage without the persistence then it wouldn't have called fsync.</p> <p>The only way to &quot;clear out&quot; the error would be by having the writes succeed. There's no reason to think that wouldn't be possible sometime. The filesystem could remap blocks or an administrator could replace degraded raid device components. The only thing Postgres could do to recover would be create a new file and move the data (reading from the dirty buffer in memory!) to a new file anyways so we would &quot;clear the error&quot; by just no longer calling fsync on the old file.</p> <p>We always read fsync as a simple write barrier. That's what the documentation promised and it's what Postgres always expected. It sounds like the kernel implementors looked at it as some kind of communication channel to communicate status report for specific writes back to user-space. That's a much more complex problem and would have entirely different interface. I think this is why we're having so much difficulty communicating.</p> <blockquote> <p>It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. 
No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.</p> </blockquote> <p>Well if they're writing to the same file that had a previous error I doubt there are many applications that would be happy to consider their writes &quot;persisted&quot; when the file was corrupt. Ironically the earlier discussion quoted talked about how applications that wanted more granular communication would be using O_DIRECT -- but what we have is fsync trying to be <em>too</em> granular such that it's impossible to get any strong guarantees about anything with it.</p> <blockquote> <blockquote> <p>One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.</p> </blockquote> <p>I would expect it would be very few, potentially those that have a very simple process model (e.g. embedded DBs that can abort a txn on fsync() EIO).</p> </blockquote> <p>Honestly I don't think there's <em>any</em> way to use the current interface to implement reliable operation. 
Even that embedded database using a single process and keeping every file open all the time (which means file descriptor limits limit its scalability) can suffer silent corruption whenever some other process like a backup program comes along and calls fsync (or even sync?).</p> <hr /> <pre><code>From:Robert Haas &lt;robertmhaas(at)gmail(dot)com&gt; Date:2018-04-09 16:45:00 </code></pre> <p>On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer wrote:</p> <blockquote> <p>In the meantime, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have many better options. At least it'll only flush what we actually wrote to the OS buffers, not what we may have in shared_buffers. If the bgwriter does the same thing, we should be 100% safe from this problem on 4.13+, and it'd be trivial to make it a GUC much like the fsync or full_page_writes options that people can turn off if they know the risks / know their storage is safe / don't care.</p> </blockquote> <p>Ouch. If a process exits -- say, because the user typed \q into psql -- then you're talking about potentially calling fsync() on a really large number of file descriptors, flushing many gigabytes of data to disk. And it may well be that you never actually wrote any data to any of those file descriptors -- those writes could have come from other backends. Or you may have written a little bit of data through those FDs, but there could be lots of other data that you end up flushing incidentally. Perfectly innocuous things like starting up a backend, running a few short queries, and then having that backend exit suddenly turn into something that could have a massive system-wide performance impact.</p> <p>Also, if a backend ever manages to exit without running through this code, or writes any dirty blocks afterward, then this still fails to fix the problem completely. 
I guess that's probably avoidable -- we can put this late in the shutdown sequence and PANIC if it fails.</p> <p>I have a really tough time believing this is the right way to solve the problem. We suffered for years because of ext3's desire to flush the entire page cache whenever any single file was fsync()'d, which was terrible. Eventually ext4 became the norm, and the problem went away. Now we're going to deliberately insert logic to do a very similar kind of terrible thing because the kernel developers have decided that fsync() doesn't have to do what it says on the tin? I grant that there doesn't seem to be a better option, but I bet we're going to have a lot of really unhappy users if we do this.</p> <hr /> <pre><code>From:&quot;Joshua D(dot) Drake&quot; &lt;jd(at)commandprompt(dot)com&gt; Date:2018-04-09 17:26:24 </code></pre> <p>On 04/09/2018 09:45 AM, Robert Haas wrote:</p> <blockquote> <p>On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer wrote:</p> <blockquote> <p>In the mean time, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have many better options. At least it'll only flush what we actually wrote to the OS buffers not what we may have in shared_buffers. If the bgwriter does the same thing, we should be 100% safe from this problem on 4.13+, and it'd be trivial to make it a GUC much like the fsync or full_page_writes options that people can turn off if they know the risks / know their storage is safe / don't care.</p> </blockquote> <p>I have a really tough time believing this is the right way to solve the problem. We suffered for years because of ext3's desire to flush the entire page cache whenever any single file was fsync()'d, which was terrible. Eventually ext4 became the norm, and the problem went away. 
Now we're going to deliberately insert logic to do a very similar kind of terrible thing because the kernel developers have decided that fsync() doesn't have to do what it says on the tin? I grant that there doesn't seem to be a better option, but I bet we're going to have a lot of really unhappy users if we do this.</p> </blockquote> <p>I don't have a better option but whatever we do, it should be an optional (GUC) change. We have plenty of YEARS of people not noticing this issue and Robert's correct, if we go back to an era of things like stalls it is going to look bad on us no matter how we describe the problem.</p> <hr /> <pre><code>From:Gasper Zejn &lt;zejn(at)owca(dot)info&gt; Date:2018-04-09 18:02:21 </code></pre> <p>On 09. 04. 2018 15:42, Tomas Vondra wrote:</p> <blockquote> <p>On 04/09/2018 12:29 AM, Bruce Momjian wrote:</p> <blockquote> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it sees something wrong.</p> </blockquote> <p>That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how that would work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after an I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.</p> <p>regards</p> </blockquote> <p>For a bit less (or more) crazy idea, I'd imagine creating a Linux kernel module with kprobe/kretprobe capturing the file passed to fsync or even byte range within file and corresponding return value shouldn't be that hard. 
Kprobe has been a part of Linux kernel for a really long time, and from first glance it seems like it could be backported to 2.6 too.</p> <p>Then you could have stable log messages or implement some kind of &quot;fsync error log notification&quot; via whatever is the most sane way to get this out of kernel.</p> <p>If the kernel is new enough and has eBPF support (seems like &gt;=4.4), using bcc-tools[1] should enable you to write a quick script to get exactly that info via perf events[2].</p> <p>Obviously, that's a stopgap solution ...</p> <p>[1] <a href="https://github.com/iovisor/bcc">https://github.com/iovisor/bcc</a> [2] <a href="https://blog.yadutaf.fr/2016/03/30/turn-any-syscall-into-event-introducing-ebpf-kernel-probes/">https://blog.yadutaf.fr/2016/03/30/turn-any-syscall-into-event-introducing-ebpf-kernel-probes/</a></p> <hr /> <pre><code>From:Mark Dilger &lt;hornschnorter(at)gmail(dot)com&gt; Date:2018-04-09 18:29:42 </code></pre> <blockquote> <p>On Apr 9, 2018, at 10:26 AM, Joshua D. Drake wrote:</p> <p>We have plenty of YEARS of people not noticing this issue</p> </blockquote> <p>I disagree. I have noticed this problem, but blamed it on other things. For over five years now, I have had to tell customers not to use thin provisioning, and I have had to add code to postgres to refuse to perform inserts or updates if the disk volume is more than 80% full. I have lost count of the number of customers who are running an older version of the product (because they refuse to upgrade) and come back with complaints that they ran out of disk and now their database is corrupt. All this time, I have been blaming this on virtualization and thin provisioning.</p> <hr /> <pre><code>From:Robert Haas &lt;robertmhaas(at)gmail(dot)com&gt; Date:2018-04-09 19:02:11 </code></pre> <p>On Mon, Apr 9, 2018 at 12:45 PM, Robert Haas wrote:</p> <blockquote> <p>Ouch. 
If a process exits -- say, because the user typed \q into psql -- then you're talking about potentially calling fsync() on a really large number of file descriptors, flushing many gigabytes of data to disk. And it may well be that you never actually wrote any data to any of those file descriptors -- those writes could have come from other backends. Or you may have written a little bit of data through those FDs, but there could be lots of other data that you end up flushing incidentally. Perfectly innocuous things like starting up a backend, running a few short queries, and then having that backend exit suddenly turn into something that could have a massive system-wide performance impact.</p> <p>Also, if a backend ever manages to exit without running through this code, or writes any dirty blocks afterward, then this still fails to fix the problem completely. I guess that's probably avoidable -- we can put this late in the shutdown sequence and PANIC if it fails.</p> <p>I have a really tough time believing this is the right way to solve the problem. We suffered for years because of ext3's desire to flush the entire page cache whenever any single file was fsync()'d, which was terrible. Eventually ext4 became the norm, and the problem went away. Now we're going to deliberately insert logic to do a very similar kind of terrible thing because the kernel developers have decided that fsync() doesn't have to do what it says on the tin? I grant that there doesn't seem to be a better option, but I bet we're going to have a lot of really unhappy users if we do this.</p> </blockquote> <p>What about the bug we fixed in <a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2ce439f3379aed857517c8ce207485655000fc8e">https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2ce439f3379aed857517c8ce207485655000fc8e</a> ? 
Say somebody does something along the lines of:</p> <pre><code>ps uxww | grep postgres | grep -v grep | awk '{print $2}' | xargs kill -9 </code></pre> <p>...and then restarts postgres. Craig's proposal wouldn't cover this case, because there was no opportunity to run fsync() after the first crash, and there's now no way to go back and fsync() any stuff we didn't fsync() before, because the kernel may have already thrown away the error state, or may lie to us and tell us everything is fine (because our new fd wasn't opened early enough). I can't find the original discussion that led to that commit right now, so I'm not exactly sure what scenarios we were thinking about. But I think it would at least be a problem if full_page_writes=off or if you had previously started the server with fsync=off and now wish to switch to fsync=on after completing a bulk load or similar. Recovery can read a page, see that it looks OK, and continue, and then a later fsync() failure can revert that page to an earlier state and now your database is corrupted -- and there's absolutely no way to detect this because write() gives you the new page contents later, fsync() doesn't feel obliged to tell you about the error because your fd wasn't opened early enough, and eventually the write can be discarded and you'll revert back to the old page version with no errors ever being reported anywhere.</p> <p>Another consequence of this behavior is that initdb -S is never reliable, so pg_rewind's use of it doesn't actually fix the problem it was intended to solve. 
It also means that initdb itself isn't crash-safe, since the data file changes are made by the backend but initdb itself is doing the fsyncs, and initdb has no way of knowing what files the backend is going to create and therefore can't -- even theoretically -- open them first.</p> <p>What's being presented to us as the API contract that we should expect from buffered I/O is that if you open a file and read() from it, call fsync(), and get no error, the kernel may nevertheless decide that some previous write that it never managed to flush can't be flushed, and then revert the page to the contents it had at some point in the past. That's more or less equivalent to letting a malicious adversary randomly overwrite database pages with plausible-looking but incorrect contents without notice and hoping you can still build a reliable system. You can avoid the problem if you can always open an fd for every file you want to modify before it's written and hold on to it until after it's fsync'd, but that's pretty hard to guarantee in the face of kill -9.</p> <p>I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. 
If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 19:13:14 </code></pre> <p>Hi,</p> <p>On 2018-04-09 15:02:11 -0400, Robert Haas wrote:</p> <blockquote> <p>I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.</p> </blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. 
But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 19:22:58 </code></pre> <p>On 04/09/2018 08:29 PM, Mark Dilger wrote:</p> <blockquote> <blockquote> <p>On Apr 9, 2018, at 10:26 AM, Joshua D. Drake wrote: We have plenty of YEARS of people not noticing this issue</p> </blockquote> <p>I disagree. I have noticed this problem, but blamed it on other things. For over five years now, I have had to tell customers not to use thin provisioning, and I have had to add code to postgres to refuse to perform inserts or updates if the disk volume is more than 80% full. I have lost count of the number of customers who are running an older version of the product (because they refuse to upgrade) and come back with complaints that they ran out of disk and now their database is corrupt. All this time, I have been blaming this on virtualization and thin provisioning.</p> </blockquote> <p>Yeah. There's a big difference between not noticing an issue because it does not happen very often vs. attributing it to something else. If we had the ability to revisit past data corruption cases, we would probably discover a fair number of cases caused by this.</p> <p>The other thing we probably need to acknowledge is that the environment changes significantly - things like thin provisioning are likely to get even more common, increasing the incidence of these issues.</p> <hr /> <pre><code>From:Peter Geoghegan &lt;pg(at)bowt(dot)ie&gt; Date:2018-04-09 19:25:33 </code></pre> <p>On Mon, Apr 9, 2018 at 12:13 PM, Andres Freund wrote:</p> <blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. 
But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> </blockquote> <p>+1</p> <blockquote> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> </blockquote> <p>Right. We seem to be implicitly assuming that there is a big difference between a problem in the storage layer that we could in principle detect, but don't, and any other problem in the storage layer. I've read articles claiming that technologies like SMART are not really reliable in a practical sense [1], so it seems to me that there is reason to doubt that this gap is all that big.</p> <p>That said, I suspect that the problems with running out of disk space are serious practical problems. I have personally scoffed at stories involving Postgres database corruption that gets attributed to running out of disk space. Looks like I was dead wrong.</p> <p>[1] <a href="https://danluu.com/file-consistency/">https://danluu.com/file-consistency/</a> -- &quot;Filesystem correctness&quot;</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 19:26:21 </code></pre> <p>On Mon, Apr 09, 2018 at 04:29:36PM +0100, Greg Stark wrote:</p> <blockquote> <p>Honestly I don't think there's <em>any</em> way to use the current interface to implement reliable operation. 
Even that embedded database using a single process and keeping every file open all the time (which means file descriptor limits limit its scalability) can be having silent corruption whenever some other process like a backup program comes along and calls fsync (or even sync?).</p> </blockquote> <p>That is indeed true (sync would induce fsync on open inodes and clear the error), and that's a nasty bug that apparently went unnoticed for a very long time. Hopefully the errseq_t linux 4.13 fixes deal with at least this issue, but similar fixes need to be adopted by many other kernels (all those that mark failed pages as clean).</p> <p>I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().</p> <p>What about having buffered IO with implied fsync() atomicity via O_SYNC? This would probably necessitate some helper threads that mask the latency and present an async interface to the rest of PG, but sounds less intrusive than going for DIO.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 19:29:16 </code></pre> <p>On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>What about having buffered IO with implied fsync() atomicity via O_SYNC?</p> </blockquote> <p>You're kidding, right? We could also just add sleep(30)'s all over the tree, and hope that that'll solve the problem. There's a reason we don't permanently fsync everything. 
Namely that it'll be way too slow.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 19:37:03 </code></pre> <p>On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote:</p> <blockquote> <p>I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().</p> </blockquote> <p>Why is that required? You could very well just keep per inode information about fatal failures that occurred around. Report errors until that bit is explicitly cleared. Yes, that keeps some memory around until unmount if nobody clears it. But it's orders of magnitude less, and results in usable semantics.</p> <hr /> <pre><code>From:Justin Pryzby &lt;pryzby(at)telsasoft(dot)com&gt; Date:2018-04-09 19:41:19 </code></pre> <p>On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:</p> <blockquote> <p>You could make the argument that it's OK to forget if the entire file system goes away. 
But actually, why is that ok?</p> </blockquote> <p>I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent unmounting; but, then I realized we need to consider the case of close()ing all FDs then opening them later..in another process.</p> <p>I was going to say that's fine for postgres, since it chdir()s into its basedir, but actually not fine for nondefault tablespaces..</p> <p>On Mon, Apr 09, 2018 at 02:54:16PM +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to).</p> </blockquote> <p>For postgres that'd require backend processes to open() a file such that, following its close(), any writeback errors are &quot;signalled&quot; to the checkpointer process...</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 19:44:31 </code></pre> <p>On Mon, Apr 09, 2018 at 12:29:16PM -0700, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>What about having buffered IO with implied fsync() atomicity via O_SYNC?</p> </blockquote> <p>You're kidding, right? We could also just add sleep(30)'s all over the tree, and hope that that'll solve the problem. There's a reason we don't permanently fsync everything. Namely that it'll be way too slow.</p> </blockquote> <p>I am assuming you can apply the same principle of selectively using O_SYNC at times and places that you'd currently actually call fsync().</p> <p>Also assuming that you'd want to have a backwards-compatible solution for all those kernels that don't keep the pages around, irrespective of future fixes. 
Short of loading a kernel module and dealing with the problem directly, the only other available options seem to be either O_SYNC, O_DIRECT or ignoring the issue.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 19:47:44 </code></pre> <p>On 04/09/2018 04:22 PM, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>We already have dirty_bytes and dirty_background_bytes, for example. I don't see why there couldn't be another limit defining how much dirty data to allow before blocking writes altogether. I'm sure it's not that simple, but you get the general idea - do not allow using all available memory because of writeback issues, but don't throw the data away in case it's just a temporary issue.</p> </blockquote> <p>Sure, there could be knobs for limiting how much memory such &quot;zombie&quot; pages may occupy. Not sure how helpful it would be in the long run since this tends to be highly application-specific, and for something with a large data footprint one would end up tuning this accordingly in a system-wide manner. This has the potential to leave other applications running in the same system with very little memory, in cases where for example original application crashes and never clears the error. Apart from that, further interfaces would need to be provided for actually dealing with the error (again assuming non-transient issues that may not be fixed transparently and that temporary issues are taken care of by lower layers of the stack).</p> </blockquote> <p>I don't quite see how this is any different from other possible issues when running multiple applications on the same system. 
One application can generate a lot of dirty data, reaching dirty_bytes and forcing the other applications on the same host to do synchronous writes.</p> <p>Of course, you might argue that is a temporary condition - it will resolve itself once the dirty pages get written to storage. In case of an I/O issue, it is a permanent impact - it will not resolve itself unless the I/O problem gets fixed.</p> <p>Not sure what interfaces would need to be written? Possibly something that says &quot;drop dirty pages for these files&quot; after the application gets killed or something. That makes sense, of course.</p> <blockquote> <blockquote> <p>Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.</p> </blockquote> <p>It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.</p> </blockquote> <p>In my experience when you have a persistent I/O error on a device, it likely affects all applications using that device. So unmounting the fs to clear the dirty pages seems like an acceptable solution to me.</p> <p>I don't see what else the application should do? In a way I'm suggesting applications don't really want to be responsible for recovering (cleanup or dirty pages etc.). 
We're more than happy to hand that over to kernel, e.g. because each kernel will do that differently. What we however do want is reliable information about fsync outcome, which we need to properly manage WAL, checkpoints etc.</p> <blockquote> <p>Ideally, you'd want a (potentially persistent) indication of error localized to a file region (mapping the corresponding failed writeback pages). NetBSD is already implementing fsync_range(), which could be a step in the right direction.</p> <blockquote> <p>One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.</p> </blockquote> <p>I would expect it would be very few, potentially those that have a very simple process model (e.g. embedded DBs that can abort a txn on fsync() EIO). I think that durability is a rather complex cross-layer issue which has been grossly misunderstood similarly in the past (e.g. see [1]). It seems that both the OS and DB communities greatly benefit from a periodic reality check, and I see this as an opportunity for strengthening the IO stack in an end-to-end manner.</p> </blockquote> <p>Right. What I was getting to is that perhaps the current fsync() behavior is not very practical for building actual applications.</p> <blockquote> <p>Best regards, Anthony</p> <p>[1] <a href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf">https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf</a></p> </blockquote> <p>Thanks. 
The paper looks interesting.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-09 19:51:12 </code></pre> <p>On Mon, Apr 09, 2018 at 12:37:03PM -0700, Andres Freund wrote:</p> <blockquote> <p>On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote:</p> <blockquote> <p>I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().</p> </blockquote> <p>Why is that required? You could very well just keep per inode information about fatal failures that occurred around. Report errors until that bit is explicitly cleared. Yes, that keeps some memory around until unmount if nobody clears it. But it's orders of magnitude less, and results in usable semantics.</p> </blockquote> <p>As discussed before, I think this could be acceptable, especially if you pair it with an opt-in mechanism (only applications that care to deal with this will have to), and would give it a shot.</p> <p>Still need a way to deal with all other systems and prior kernel releases that are eating fsync() writeback errors even over sync().</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 19:54:05 </code></pre> <p>On 04/09/2018 09:37 PM, Andres Freund wrote:</p> <blockquote> <p>On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote:</p> <blockquote> <p>I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().</p> </blockquote> <p>Why is that required? You could very well just keep per inode information about fatal failures that occurred around. Report errors until that bit is explicitly cleared. Yes, that keeps some memory around until unmount if nobody clears it. 
But it's orders of magnitude less, and results in usable semantics.</p> </blockquote> <p>Isn't the expectation that when a fsync call fails, the next one will retry writing the pages in the hope that it succeeds?</p> <p>Of course, it's also possible to do what you suggested, and simply mark the inode as failed. In which case the next fsync can't possibly retry the writes (e.g. after freeing some space on thin-provisioned system), but we'd get reliable failure mode.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 19:59:34 </code></pre> <p>On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:</p> <blockquote> <p>You could make the argument that it's OK to forget if the entire file system goes away. But actually, why is that ok?</p> </blockquote> <p>I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent unmounting; but, then I realized we need to consider the case of close()ing all FDs then opening them later..in another process.</p> <p>On Mon, Apr 09, 2018 at 02:54:16PM +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to).</p> </blockquote> <p>For postgres that'd require backend processes to open() a file such that, following its close(), any writeback errors are &quot;signalled&quot; to the checkpointer process...</p> </blockquote> <p>I don't think that's as hard as some people argued in this thread. We could very well open a pipe in postmaster with the write end open in each subprocess, and the read end open only in checkpointer (and postmaster, but unused there). 
Whenever closing a file descriptor that was dirtied in the current process, send it over the pipe to the checkpointer. The checkpointer then can receive all those file descriptors (making sure it's not above the limit, fsync(), close()ing to make room if necessary). The biggest complication would presumably be to deduplicate the received file descriptors for the same file, without losing track of any errors.</p> <p>Even better, we could do so via a dedicated worker. That'd quite possibly end up as a performance benefit.</p> <blockquote> <p>I was going to say that's fine for postgres, since it chdir()s into its basedir, but actually not fine for nondefault tablespaces..</p> </blockquote> <p>I think it'd be fair to open PG_VERSION of all created tablespaces. Would require some hangups to signal checkpointer (or whichever process) to do so when creating one, but it shouldn't be too hard. Some people would complain because they can't do some nasty hacks anymore, but it'd also save people's butts by preventing them from accidentally unmounting.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 20:04:20 </code></pre> <p>Hi,</p> <p>On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote:</p> <blockquote> <p>Isn't the expectation that when a fsync call fails, the next one will retry writing the pages in the hope that it succeeds?</p> </blockquote> <p>Some people expect that, I personally don't think it's a useful expectation.</p> <p>We should just deal with this by crash-recovery. The big problem I see is that you always need to keep a file descriptor open for pretty much any file written to inside and outside of postgres, to be guaranteed to see errors. And that'd solve that. Even if retrying would work, I'd advocate for that (I've done so in the past, and I've written code in pg that panics on fsync failure...).</p> <p>What we'd need to do however is to clear that bit during crash recovery... 
Which is interesting from a policy perspective. Could be that other apps wouldn't want that.</p> <p>I also wonder if we couldn't just somewhere read each relevant mounted filesystem's errseq value. Whenever checkpointer notices before finishing a checkpoint that it has changed, do a crash restart.</p> <hr /> <pre><code>From:Mark Dilger &lt;hornschnorter(at)gmail(dot)com&gt; Date:2018-04-09 20:25:54 </code></pre> <blockquote> <p>On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:</p> <p>Hi,</p> <p>On 2018-04-09 15:02:11 -0400, Robert Haas wrote:</p> <blockquote> <p>I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.</p> </blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. 
But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> </blockquote> <p>I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted. That seems a much bigger problem than merely having the master become corrupted in some unrecoverable way. It is a long standing expectation that serious hardware problems on the master can result in the master needing to be replaced. But there has not been an expectation that the one or more standby servers would be taken down along with the master, leaving all copies of the database unusable. If this bug corrupts the standby servers, too, then it is a whole different class of problem than the one folks have come to expect.</p> <p>Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. Am I reading that right?</p> <p>Can anybody clarify this for non-core-hacker folks following along at home?</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 20:30:00 </code></pre> <p>On 04/09/2018 10:04 PM, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote:</p> <blockquote> <p>Isn't the expectation that when a fsync call fails, the next one will retry writing the pages in the hope that it succeeds?</p> </blockquote> <p>Some people expect that, I personally don't think it's a useful expectation.</p> </blockquote> <p>Maybe. I'd certainly prefer automated recovery from temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. 
But I'm not sure it's worth the effort.</p> <p>And most importantly, it's rather delusional to think the kernel developers are going to be enthusiastic about that approach ...</p> <blockquote> <p>We should just deal with this by crash-recovery. The big problem I see is that you always need to keep a file descriptor open for pretty much any file written to inside and outside of postgres, to be guaranteed to see errors. And that'd solve that. Even if retrying would work, I'd advocate for that (I've done so in the past, and I've written code in pg that panics on fsync failure...).</p> </blockquote> <p>Sure. And it's likely way less invasive from kernel perspective.</p> <blockquote> <p>What we'd need to do however is to clear that bit during crash recovery... Which is interesting from a policy perspective. Could be that other apps wouldn't want that.</p> </blockquote> <p>IMHO it'd be enough if a remount clears it.</p> <blockquote> <p>I also wonder if we couldn't just somewhere read each relevant mounted filesystem's errseq value. Whenever checkpointer notices before finishing a checkpoint that it has changed, do a crash restart.</p> </blockquote> <p>Hmmmm, that's an interesting idea, and it's about the only thing that would help us on older kernels. There's a wb_err in address_space, but that's at inode level. Not sure if there's something at fs level.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 20:34:15 </code></pre> <p>Hi,</p> <p>On 2018-04-09 13:25:54 -0700, Mark Dilger wrote:</p> <blockquote> <p>I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted.</p> </blockquote> <p>I don't see that as a real problem here. For one the problematic scenarios shouldn't readily apply, for another WAL is checksummed.</p> <p>There's the problem that a new basebackup would potentially become corrupted however. 
And similarly pg_rewind.</p> <p>Note that I'm not saying that we and/or linux shouldn't change anything. Just that the apocalypse isn't here.</p> <blockquote> <p>Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. Am I reading that right?</p> </blockquote> <p>I think that's basically right. There's cases where corruption could get propagated, but they're not straightforward.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 20:37:31 </code></pre> <p>Hi,</p> <p>On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:</p> <blockquote> <p>Maybe. I'd certainly prefer automated recovery from temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.</p> </blockquote> <p>Oh, I agree on that one. But that's more a question of how we force the kernel's hand on allocating disk space. In most cases the kernel allocates the disk space immediately, even if delayed allocation is in effect. For the cases where that's not the case (if there are current ones, rather than just past bugs), we should be able to make sure that's not an issue by pre-zeroing the data and/or using fallocate.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 20:43:03 </code></pre> <p>On 04/09/2018 10:25 PM, Mark Dilger wrote:</p> <blockquote> <blockquote> <p>On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:</p> <p>Hi,</p> <p>On 2018-04-09 15:02:11 -0400, Robert Haas wrote:</p> <blockquote> <p>I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. 
Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.</p> </blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> </blockquote> <p>I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted. That seems a much bigger problem than merely having the master become corrupted in some unrecoverable way. It is a long standing expectation that serious hardware problems on the master can result in the master needing to be replaced. But there has not been an expectation that the one or more standby servers would be taken down along with the master, leaving all copies of the database unusable. 
If this bug corrupts the standby servers, too, then it is a whole different class of problem than the one folks have come to expect.</p> <p>Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. Am I reading that right?</p> <p>Can anybody clarify this for non-core-hacker folks following along at home?</p> </blockquote> <p>That's a good question. I don't see any guarantee it'd be isolated to the master node. Consider this example:</p> <p>(0) checkpoint happens on the primary</p> <p>(1) a page gets modified, a full-page gets written to WAL</p> <p>(2) the page is written out to page cache</p> <p>(3) writeback of that page fails (and gets discarded)</p> <p>(4) we attempt to modify the page again, but we read the stale version</p> <p>(5) we modify the stale version, writing the change to WAL</p> <p>The standby will get the full-page, and then a WAL from the stale page version. That doesn't seem like a story with a happy end, I guess. But I might be easily missing some protection built into the WAL ...</p> <hr /> <pre><code>From:Mark Dilger &lt;hornschnorter(at)gmail(dot)com&gt; Date:2018-04-09 20:55:29 </code></pre> <blockquote> <p>On Apr 9, 2018, at 1:43 PM, Tomas Vondra wrote:</p> <p>On 04/09/2018 10:25 PM, Mark Dilger wrote:</p> <blockquote> <blockquote> <p>On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:</p> <p>Hi,</p> <p>On 2018-04-09 15:02:11 -0400, Robert Haas wrote:</p> <blockquote> <p>I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. 
Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.</p> </blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> </blockquote> <p>I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted. That seems a much bigger problem than merely having the master become corrupted in some unrecoverable way. It is a long standing expectation that serious hardware problems on the master can result in the master needing to be replaced. But there has not been an expectation that the one or more standby servers would be taken down along with the master, leaving all copies of the database unusable. If this bug corrupts the standby servers, too, then it is a whole different class of problem than the one folks have come to expect.</p> <p>Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. 
Am I reading that right?</p> <p>Can anybody clarify this for non-core-hacker folks following along at home?</p> </blockquote> <p>That's a good question. I don't see any guarantee it'd be isolated to the master node. Consider this example:</p> <p>(0) checkpoint happens on the primary</p> <p>(1) a page gets modified, a full-page gets written to WAL</p> <p>(2) the page is written out to page cache</p> <p>(3) writeback of that page fails (and gets discarded)</p> <p>(4) we attempt to modify the page again, but we read the stale version</p> <p>(5) we modify the stale version, writing the change to WAL</p> <p>The standby will get the full-page, and then a WAL from the stale page version. That doesn't seem like a story with a happy end, I guess. But I might be easily missing some protection built into the WAL ...</p> </blockquote> <p>I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption. When choosing to have one standby, or two standbys, or ten standbys, one needs to be able to assume a certain amount of statistical independence between failures on one server and failures on another. If they are tightly correlated dependent variables, then the conclusion that the probability of all nodes failing simultaneously is vanishingly small becomes invalid.</p> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-09 21:08:29 </code></pre> <p>Hi,</p> <p>On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:</p> <blockquote> <p>I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption.</p> </blockquote> <p>I think it's a grave mistake conflating ENOSPC issues (which we should solve by making sure there's always enough space pre-allocated), with EIO type errors. 
The problem is different, the solution is different.</p> <hr /> <pre><code>From:Tomas Vondra &lt;tomas(dot)vondra(at)2ndquadrant(dot)com&gt; Date:2018-04-09 21:25:52 </code></pre> <p>On 04/09/2018 11:08 PM, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:</p> <blockquote> <p>I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption.</p> </blockquote> <p>I think it's a grave mistake conflating ENOSPC issues (which we should solve by making sure there's always enough space pre-allocated), with EIO type errors. The problem is different, the solution is different.</p> </blockquote> <p>In any case, that certainly does not count as data corruption spreading from the master to standby.</p> <hr /> <pre><code>From:Mark Dilger &lt;hornschnorter(at)gmail(dot)com&gt; Date:2018-04-09 21:33:29 </code></pre> <blockquote> <p>On Apr 9, 2018, at 2:25 PM, Tomas Vondra wrote:</p> <p>On 04/09/2018 11:08 PM, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:</p> <blockquote> <p>I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption.</p> </blockquote> <p>I think it's a grave mistake conflating ENOSPC issues (which we should solve by making sure there's always enough space pre-allocated), with EIO type errors. The problem is different, the solution is different.</p> </blockquote> </blockquote> <p>I'm happy to take your word for that.</p> <blockquote> <p>In any case, that certainly does not count as data corruption spreading from the master to standby.</p> </blockquote> <p>Maybe not from the point of view of somebody looking at the code. But a user might see it differently. 
If the data being loaded into the master and getting replicated to the standby &quot;causes&quot; both to get corrupt, then it seems like corruption spreading. I put &quot;causes&quot; in quotes because there is some argument to be made about &quot;correlation does not prove cause&quot; and so forth, but it still feels like causation from an arm's-length perspective. If there is a pattern of standby servers tending to fail more often right around the time that the master fails, you'll have a hard time comforting users, &quot;hey, it's not technically causation.&quot; If loading data into the master causes the master to hit ENOSPC, and replicating that data to the standby causes the standby to hit ENOSPC, and if the bug about ENOSPC has not been fixed, then this looks like corruption spreading.</p> <p>I'm certainly planning on taking a hard look at the disk allocation on my standby servers right soon now.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-09 22:33:16 </code></pre> <p>On Tue, Apr 10, 2018 at 2:22 AM, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.</p> </blockquote> <p>It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. 
But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.</p> </blockquote> <p>Right. For anyone interested, here is the change you mentioned, and an interesting one that came a bit earlier last year:</p> <ul> <li><a href="https://reviews.freebsd.org/rS316941">https://reviews.freebsd.org/rS316941</a> -- drop buffers after device goes away</li> <li><a href="https://reviews.freebsd.org/rS326029">https://reviews.freebsd.org/rS326029</a> -- update comment about EIO contract</li> </ul> <p>Retrying may well be futile, but at least future fsync() calls won't report success bogusly. There may of course be more space-efficient ways to represent that state as the comment implies, while never lying to the user -- perhaps involving filesystem level or (pinned) inode level errors that stop all writes until unmounted. Something tells me they won't resort to flakey fsync() error reporting.</p> <p>I wonder if anyone can tell us what Windows, AIX and HPUX do here.</p> <blockquote> <p>[1] <a href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf">https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf</a></p> </blockquote> <p>Very interesting, thanks.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-10 00:32:20 </code></pre> <p>On Tue, Apr 10, 2018 at 10:33 AM, Thomas Munro wrote:</p> <blockquote> <p>I wonder if anyone can tell us what Windows, AIX and HPUX do here.</p> </blockquote> <p>I created a wiki page to track what we know (or think we know) about fsync() on various operating systems:</p> <p><a href="https://wiki.postgresql.org/wiki/Fsync_Errors">https://wiki.postgresql.org/wiki/Fsync_Errors</a></p> <p>If anyone has more information or sees mistakes, please go ahead and edit it.</p> <hr /> <pre><code>From:Andreas Karlsson 
&lt;andreas(at)proxel(dot)se&gt; Date:2018-04-10 00:41:10 </code></pre> <p>On 04/09/2018 02:16 PM, Craig Ringer wrote:</p> <blockquote> <p>I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. &quot;Tell us about anything that happens under pgdata/&quot; or an inotify-style per-directory-registration option. I'd even say that's ideal.</p> </blockquote> <p>Could there be a risk of a race condition here where fsync incorrectly returns success before we get the notification of that something went wrong?</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-10 01:44:59 </code></pre> <p>On 10 April 2018 at 03:59, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:</p> <blockquote> <p>You could make the argument that it's OK to forget if the entire file system goes away. But actually, why is that ok?</p> </blockquote> <p>I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent unmounting; but, then I realized we need to consider the case of close()ing all FDs then opening them later..in another process.</p> <p>On Mon, Apr 09, 2018 at 02:54:16PM +0200, Anthony Iliopoulos wrote:</p> <blockquote> <p>notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to).</p> </blockquote> <p>For postgres that'd require backend processes to open() an file such that, following its close(), any writeback errors are &quot;signalled&quot; to the checkpointer process...</p> </blockquote> <p>I don't think that's as hard as some people argued in this thread. 
We could very well open a pipe in postmaster with the write end open in each subprocess, and the read end open only in checkpointer (and postmaster, but unused there). Whenever closing a file descriptor that was dirtied in the current process, send it over the pipe to the checkpointer. The checkpointer then can receive all those file descriptors (making sure it's not above the limit, fsync(), close()ing to make room if necessary). The biggest complication would presumably be to deduplicate the received file descriptors for the same file, without losing track of any errors.</p> </blockquote> <p>Yep. That'd be a cheaper way to do it, though it wouldn't work on Windows. Though we don't know how Windows behaves here at all yet.</p> <p>Prior discussion upthread had the checkpointer open()ing a file at the same time as a backend, before the backend writes to it. But passing the fd when the backend is done with it would be better.</p> <p>We'd need a way to dup() the fd and pass it back to a backend when it needed to reopen it sometimes, or just make sure to keep the oldest copy of the fd when a backend reopens multiple times, but that's no biggie.</p> <p>We'd still have to fsync() out early in the checkpointer if we ran out of space in our FD list, and initscripts would need to change our ulimit or we'd have to do it ourselves in the checkpointer. But neither seems insurmountable.</p> <p>FWIW, I agree that this is a corner case, but it's getting to be a pretty big corner with the spread of overcommitted, deduplicating SANs, cloud storage, etc. Not all I/O errors indicate permanent hardware faults, disk failures, etc, as I outlined earlier. I'm very curious to know what AWS EBS's error semantics are, and other cloud network block stores. 
(I posted on Amazon forums <a href="https://forums.aws.amazon.com/thread.jspa?threadID=279274&amp;tstart=0">https://forums.aws.amazon.com/thread.jspa?threadID=279274&amp;tstart=0</a> but nothing so far).</p> <p>I'm also not particularly inclined to trust that all file systems will always reliably reserve space without having some cases where they'll fail writeback on space exhaustion.</p> <p>So we don't need to panic and freak out, but it's worth looking at the direction the storage world is moving in, and whether this will become a bigger issue over time.</p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-10 01:52:21 </code></pre> <p>On Tue, Apr 10, 2018 at 1:44 PM, Craig Ringer wrote:</p> <blockquote> <p>On 10 April 2018 at 03:59, Andres Freund wrote:</p> <blockquote> <p>I don't think that's as hard as some people argued in this thread. We could very well open a pipe in postmaster with the write end open in each subprocess, and the read end open only in checkpointer (and postmaster, but unused there). Whenever closing a file descriptor that was dirtied in the current process, send it over the pipe to the checkpointer. The checkpointer then can receive all those file descriptors (making sure it's not above the limit, fsync(), close()ing to make room if necessary). The biggest complication would presumably be to deduplicate the received file descriptors for the same file, without losing track of any errors.</p> </blockquote> <p>Yep. That'd be a cheaper way to do it, though it wouldn't work on Windows. Though we don't know how Windows behaves here at all yet.</p> <p>Prior discussion upthread had the checkpointer open()ing a file at the same time as a backend, before the backend writes to it. 
But passing the fd when the backend is done with it would be better.</p> </blockquote> <p>How would that interlock with concurrent checkpoints?</p> <p>I can see how to make that work if the share-fd-or-fsync-now logic happens in smgrwrite() when called by FlushBuffer() while you hold io_in_progress, but not if you defer it to some random time later.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-10 01:54:30 </code></pre> <p>On 10 April 2018 at 04:25, Mark Dilger wrote:</p> <blockquote> <p>I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted.</p> </blockquote> <p>Yes, it can, but not directly through the first error.</p> <p>What can happen is that we think a block got written when it didn't.</p> <p>If our in memory state diverges from our on disk state, we can make subsequent WAL writes based on that wrong information. But that's actually OK, since the standby will have replayed the original WAL correctly.</p> <p>I think the only time we'd run into trouble is if we evict the good (but not written out) data from s_b and the fs buffer cache, then later read in the old version of a block we failed to overwrite. Data checksums (if enabled) might catch it unless the write left the whole block stale. In that case we might generate a full page write with the stale block and propagate that over WAL to the standby.</p> <p>So I'd say standbys are relatively safe - very safe if the issue is caught promptly, and less so over time. 
But AFAICS WAL-based replication (physical or logical) is not a perfect defense for this.</p> <p>However, remember, if your storage system is free of any sort of overprovisioning, is on a non-network file system, and doesn't use multipath (or sets it up right) this issue <em>is exceptionally unlikely to affect you</em>.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-10 01:59:03 </code></pre> <p>On 10 April 2018 at 04:37, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:</p> <blockquote> <p>Maybe. I'd certainly prefer automated recovery from an temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.</p> </blockquote> <p>Oh, I agree on that one. But that's more a question of how we force the kernel's hand on allocating disk space. In most cases the kernel allocates the disk space immediately, even if delayed allocation is in effect. For the cases where that's not the case (if there are current ones, rather than just past bugs), we should be able to make sure that's not an issue by pre-zeroing the data and/or using fallocate.</p> </blockquote> <p>Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.</p> <p>EXT4 and XFS don't allocate until later, it by performing actual writes to FS metadata, initializing disk blocks, etc. So we won't notice errors that are only detectable at actual time of allocation, like thin provisioning problems, until after write() returns and we face the same writeback issues.</p> <p>So I reckon you're safe from space-related issues if you're not on NFS (and whyyy would you do that?) and not thinly provisioned. I'm sure there are other corner cases, but I don't see any reason to expect space-exhaustion-related corruption problems on a sensible FS backed by a sensible block device. 
I haven't tested things like quotas, verified how reliable space reservation is under concurrency, etc as yet.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-10 02:00:59 </code></pre> <p>On April 9, 2018 6:59:03 PM PDT, Craig Ringer wrote:</p> <blockquote> <p>On 10 April 2018 at 04:37, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:</p> <blockquote> <p>Maybe. I'd certainly prefer automated recovery from an temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.</p> </blockquote> <p>Oh, I agree on that one. But that's more a question of how we force the kernel's hand on allocating disk space. In most cases the kernel allocates the disk space immediately, even if delayed allocation is in effect. For the cases where that's not the case (if there are current ones, rather than just past bugs), we should be able to make sure that's not an issue by pre-zeroing the data and/or using fallocate.</p> </blockquote> <p>Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.</p> <p>EXT4 and XFS don't allocate until later, it by performing actual writes to FS metadata, initializing disk blocks, etc. So we won't notice errors that are only detectable at actual time of allocation, like thin provisioning problems, until after write() returns and we face the same writeback issues.</p> <p>So I reckon you're safe from space-related issues if you're not on NFS (and whyyy would you do that?) and not thinly provisioned. I'm sure there are other corner cases, but I don't see any reason to expect space-exhaustion-related corruption problems on a sensible FS backed by a sensible block device. 
I haven't tested things like quotas, verified how reliable space reservation is under concurrency, etc as yet.</p> </blockquote> <p>How's that not solved by pre-zeroing and/or fallocate as I suggested above?</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-10 02:02:48 </code></pre> <p>On 10 April 2018 at 08:41, Andreas Karlsson wrote:</p> <blockquote> <p>On 04/09/2018 02:16 PM, Craig Ringer wrote:</p> <blockquote> <p>I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. &quot;Tell us about anything that happens under pgdata/&quot; or an inotify-style per-directory-registration option. I'd even say that's ideal.</p> </blockquote> <p>Could there be a risk of a race condition here where fsync incorrectly returns success before we get the notification that something went wrong?</p> </blockquote> <p>We'd examine the notification queue only once all our checkpoint fsync()s had succeeded, and before we updated the control file to advance the redo position.</p> <p>I'm intrigued by the suggestion upthread of using a kprobe or similar to achieve this. It's a horrifying unportable hack that'd make kernel people cry, and I don't know if we have any way to flush buffered probe data to be sure we really get the news in time, but it's a cool idea too.</p> <hr /> <pre><code>From:Michael Paquier &lt;michael(at)paquier(dot)xyz&gt; Date:2018-04-10 05:04:13 </code></pre> <p>On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote:</p> <blockquote> <p>Another consequence of this behavior is that initdb -S is never reliable, so pg_rewind's use of it doesn't actually fix the problem it was intended to solve. 
It also means that initdb itself isn't crash-safe, since the data file changes are made by the backend but initdb itself is doing the fsyncs, and initdb has no way of knowing what files the backend is going to create and therefore can't -- even theoretically -- open them first.</p> </blockquote> <p>And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb -S or fsync_pgdata would enter in those waters.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-10 05:37:19 </code></pre> <p>On 10 April 2018 at 13:04, Michael Paquier wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote:</p> <blockquote> <p>Another consequence of this behavior is that initdb -S is never reliable, so pg_rewind's use of it doesn't actually fix the problem it was intended to solve. It also means that initdb itself isn't crash-safe, since the data file changes are made by the backend but initdb itself is doing the fsyncs, and initdb has no way of knowing what files the backend is going to create and therefore can't -- even theoretically -- open them first.</p> </blockquote> <p>And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb -S or fsync_pgdata would enter in those waters.</p> </blockquote> <p>... but <em>only if they hit an I/O error</em> or they're on a FS that doesn't reserve space and hit ENOSPC.</p> <p>It still does 99% of the job. It still flushes all buffers to persistent storage and maintains write ordering. It may not detect and report failures to the user how we'd expect it to, yes, and that's not great. But it's hardly throw-up-our-hands-and-give-up territory either. Also, at least for initdb, we can make initdb fsync() its own files before close(). 
Annoying but hardly the end of the world.</p> <hr /> <pre><code>From:Michael Paquier &lt;michael(at)paquier(dot)xyz&gt; Date:2018-04-10 06:10:21 </code></pre> <p>On Tue, Apr 10, 2018 at 01:37:19PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 10 April 2018 at 13:04, Michael Paquier wrote:</p> <blockquote> <p>And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb -S or fsync_pgdata would enter in those waters.</p> </blockquote> <p>... but <em>only if they hit an I/O error</em> or they're on a FS that doesn't reserve space and hit ENOSPC.</p> </blockquote> <p>Sure.</p> <blockquote> <p>It still does 99% of the job. It still flushes all buffers to persistent storage and maintains write ordering. It may not detect and report failures to the user how we'd expect it to, yes, and that's not great. But it's hardly throw up our hands and give up territory either. Also, at least for initdb, we can make initdb fsync() its own files before close(). Annoying but hardly the end of the world.</p> </blockquote> <p>Well, I think that there is place for improving reporting of failure in file_utils.c for frontends, or at worst have an exit() for any kind of critical failures equivalent to a PANIC.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-10 12:15:15 </code></pre> <p>On 10 April 2018 at 14:10, Michael Paquier wrote:</p> <blockquote> <p>Well, I think that there is place for improving reporting of failure in file_utils.c for frontends, or at worst have an exit() for any kind of critical failures equivalent to a PANIC.</p> </blockquote> <p>Yup.</p> <p>In the mean time, speaking of PANIC, here's the first cut patch to make Pg panic on fsync() failures. 
I need to do some closer review and testing, but it's presented here for anyone interested.</p> <p>I intentionally left some failures as ERROR not PANIC, where the entire operation is done as a unit, and an ERROR will cause us to retry the whole thing.</p> <p>For example, when we fsync() a temp file before we move it into place, there's no point panicking on failure, because we'll discard the temp file on ERROR and retry the whole thing.</p> <p>I've verified that it works as expected with some modifications to the test tool I've been using (pushed).</p> <p>The main downside is that if we panic in redo, we don't try again. We throw our toys and shut down. But arguably if we get the same I/O error again in redo, that's the right thing to do anyway, and quite likely safer than continuing to ERROR on checkpoints indefinitely.</p> <p>Patch attached.</p> <p>To be clear, this patch only deals with the issue of us retrying fsyncs when it turns out to be unsafe. This does NOT address any of the issues where we won't find out about writeback errors at all.</p> <p>Attachment: v1-0001-PANIC-when-we-detect-a-possible-fsync-I-O-error-i.patch (text/x-patch, 10.3 KB)</p> <hr /> <pre><code>From:Robert Haas &lt;robertmhaas(at)gmail(dot)com&gt; Date:2018-04-10 15:15:46 </code></pre> <p>On Mon, Apr 9, 2018 at 3:13 PM, Andres Freund wrote:</p> <blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> </blockquote> <p>Well, I admit that I wasn't entirely serious about that email, but I wasn't entirely not-serious either. If you can't reliably find out whether the contents of the file on disk are the same as the contents that the kernel is giving you when you call read(), then you are going to have a heck of a time building a reliable system. 
If the kernel developers are determined to insist on these semantics (and, admittedly, I don't know whether that's the case - I've only read Anthony's remarks), then I don't really see what we can do except give up on buffered I/O (or on Linux).</p> <blockquote> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> </blockquote> <p>I think that reliable error reporting is more than &quot;nice&quot; -- I think it's essential. The only argument for the current Linux behavior that has been so far advanced on this thread, at least as far as I can see, is that if it kept retrying the buffers forever, it would be pointless and might run the machine out of memory, so we might as well discard them. But previous comments have already illustrated that the kernel is not really up against a wall there -- it could put individual inodes into a permanent failure state when it discards their dirty data, as you suggested, or it could do what others have suggested, and what I think is better, which is to put the whole filesystem into a permanent failure state that can be cleared by remounting the FS. That could be done on an as-needed basis -- if the number of dirty buffers you're holding onto for some filesystem becomes too large, put the filesystem into infinite-fail mode and discard them all. That behavior would be pretty easy for administrators to understand and would resolve the entire problem here provided that no PostgreSQL processes survived the eventual remount.</p> <p>I also don't really know what we mean by an &quot;unresolvable&quot; error. If the drive is beyond all hope, then it doesn't really make sense to talk about whether the database stored on it is corrupt. 
In general we can't be sure that we'll even get an error - e.g. the system could be idle and the drive could be on fire. Maybe this is the case you meant by &quot;it'd be nice if we could report it reliably&quot;. But at least in my experience, that's typically not what's going on. You get some I/O errors and so you remount the filesystem, or reboot, or rebuild the array, or ... something. And then the errors go away and, at that point, you want to run recovery and continue using your database. In this scenario, it matters <em>quite a bit</em> what the error reporting was like during the period when failures were occurring. In particular, if the database was allowed to think that it had successfully checkpointed when it didn't, you're going to start recovery from the wrong place.</p> <p>I'm going to shut up now because I'm telling you things that you obviously already know, but this doesn't sound like a &quot;near irresolvable corner case&quot;. When the storage goes bonkers, either PostgreSQL and the kernel can interact in such a way that a checkpoint can succeed without all of the relevant data getting persisted, or they don't. It sounds like right now they do, and I'm not really clear that we have a reasonable idea how to fix that. It does not sound like a PANIC is sufficient.</p> <hr /> <pre><code>From:Robert Haas &lt;robertmhaas(at)gmail(dot)com&gt; Date:2018-04-10 15:28:07 </code></pre> <p>On Tue, Apr 10, 2018 at 1:37 AM, Craig Ringer wrote:</p> <blockquote> <p>... but <em>only if they hit an I/O error</em> or they're on a FS that doesn't reserve space and hit ENOSPC.</p> <p>It still does 99% of the job. It still flushes all buffers to persistent storage and maintains write ordering. It may not detect and report failures to the user how we'd expect it to, yes, and that's not great. But it's hardly throw up our hands and give up territory either. Also, at least for initdb, we can make initdb fsync() its own files before close(). 
Annoying but hardly the end of the world.</p> </blockquote> <p>I think we'd need every child postgres process started by initdb to do that individually, which I suspect would slow down initdb quite a lot. Now admittedly for anybody other than a PostgreSQL developer that's only a minor issue, and our regression tests mostly run with fsync=off anyway. But I have a strong suspicion that our assumptions about how fsync() reports errors are baked into an awful lot of parts of the system, and by the time we get unbaking them I think it's going to be really surprising if we haven't done real harm to overall system performance.</p> <p>BTW, I took a look at the MariaDB source code to see whether they've got this problem too and it sure looks like they do. os_file_fsync_posix() retries the fsync in a loop with an 0.2 second sleep after each retry. It warns after 100 failures and fails an assertion after 1000 failures. It is hard to understand why they would have written the code this way unless they expect errors reported by fsync() to continue being reported until the underlying condition is corrected. But, it looks like they wouldn't have the problem that we do with trying to reopen files to fsync() them later -- I spot checked a few places where this code is invoked and in all of those it looks like the file is already expected to be open.</p> <hr /> <pre><code>From:Anthony Iliopoulos &lt;ailiop(at)altatus(dot)com&gt; Date:2018-04-10 15:40:05 </code></pre> <p>Hi Robert,</p> <p>On Tue, Apr 10, 2018 at 11:15:46AM -0400, Robert Haas wrote:</p> <blockquote> <p>On Mon, Apr 9, 2018 at 3:13 PM, Andres Freund wrote:</p> <blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. 
But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> </blockquote> <p>Well, I admit that I wasn't entirely serious about that email, but I wasn't entirely not-serious either. If you can't reliably find out whether the contents of the file on disk are the same as the contents that the kernel is giving you when you call read(), then you are going to have a heck of a time building a reliable system. If the kernel developers are determined to insist on these semantics (and, admittedly, I don't know whether that's the case - I've only read Anthony's remarks), then I don't really see what we can do except give up on buffered I/O (or on Linux).</p> </blockquote> <p>I think it would be interesting to get in touch with some of the respective Linux kernel maintainers and open up this topic for more detailed discussions. LSF/MM'18 is upcoming and it would have been the perfect opportunity but it's past the CFP deadline. It may still be worth contacting the organizers to bring forward the issue, and see if there is a chance to have someone from Pg invited for further discussions.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-10 16:38:27 </code></pre> <p>On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:</p> <blockquote> <p>On Mon, Apr 09, 2018 at 09:45:40AM +0100, Greg Stark wrote:</p> <blockquote> <p>On 8 April 2018 at 22:47, Anthony Iliopoulos wrote:</p> </blockquote> <p>To make things a bit simpler, let us focus on EIO for the moment. The contract between the block layer and the filesystem layer is assumed to be that, when an EIO is propagated up to the fs, you may assume that all possibilities for recovering have been exhausted in lower layers of the stack.</p> </blockquote> <p>Well, Postgres is using the filesystem. 
The interface between the block layer and the filesystem may indeed need to be more complex, I wouldn't know.</p> <p>But I don't think &quot;all possibilities&quot; is a very useful concept. Neither layer here is going to be perfect. They can only promise that all possibilities that have actually been implemented have been exhausted. And even among those only to the degree they can be done automatically within the engineering tradeoffs and constraints. There will always be cases like thin provisioned devices that an operator can expand, or degraded raid arrays that can be repaired after a long operation and so on. A network device can't be sure whether a remote server may eventually come back or not and have to be reconfigured by a human or system automation tool to point to the new server or new network configuration.</p> <blockquote> <p>Right. This implies though that apart from the kernel having to keep around the dirtied-but-unrecoverable pages for an unbounded time, that there's further an interface for obtaining the exact failed pages so that you can read them back.</p> </blockquote> <p>No, the interface we have is fsync which gives us that information with the granularity of a single file. The database could in theory recognize that fsync is not completing on a file and read that file back and write it to a new file. More likely we would implement a feature Oracle has of writing key files to multiple devices. But currently in practice that's not what would happen, what would happen would be a human would recognize that the database has stopped being able to commit and there are hardware errors in the log and would stop the database, take a backup, and restore onto a new working device. The current interface is that there's one error and then Postgres would pretty much have to say, &quot;sorry, your database is corrupt and the data is gone, restore from your backups&quot;. 
Which is pretty dismal.</p> <blockquote> <p>There is a clear responsibility of the application to keep its buffers around until a successful fsync(). The kernels do report the error (albeit with all the complexities of dealing with the interface), at which point the application may not assume that the write()s were ever even buffered in the kernel page cache in the first place.</p> </blockquote> <p>Postgres cannot just store the entire database in RAM. It writes things to the filesystem all the time. It calls fsync only when it needs a write barrier to ensure consistency. That's only frequent on the transaction log to ensure it's flushed before data modifications and then periodically to checkpoint the data files. The amount of data written between checkpoints can be arbitrarily large and Postgres has no idea how much memory is available as filesystem buffers or how much i/o bandwidth is available or what other memory pressure there is. What you're suggesting is that the application should have to babysit the filesystem buffer cache and reimplement all of it in user-space because the filesystem is free to throw away any data any time it chooses?</p> <p>The current interface to throw away filesystem buffer cache is unmount. It sounds like the kernel would like a more granular way to discard just part of a device which makes a lot of sense in the age of large network block devices. But I don't think just saying that the filesystem buffer cache is now something every application needs to re-implement in user-space really helps with that, they're going to have the same problems to solve.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-10 16:54:40 </code></pre> <p>On 10 April 2018 at 02:59, Craig Ringer wrote:</p> <blockquote> <p>Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.</p> </blockquote> <p>I'm kind of puzzled by this. 
Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well? I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).</p> <hr /> <pre><code>From:&quot;Joshua D(dot) Drake&quot; &lt;jd(at)commandprompt(dot)com&gt; Date:2018-04-10 18:58:37 </code></pre> <p>-hackers,</p> <p>I reached out to the Linux ext4 devs, here is tytso(at)mit(dot)edu's response:</p> <p>&quot;&quot;&quot; Hi Joshua,</p> <p>This isn't actually an ext4 issue, but a long-standing VFS/MM issue.</p> <p>There are going to be multiple opinions about what the right thing to do is. I'll try to give as unbiased a description as possible, but certainly some of this is going to be filtered by my own biases no matter how careful I can be.</p> <p>First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations since the LBA in question was last written six months ago. 
No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.</p> <p>Which is why after a while, one can get quite paranoid and assume that the only way you can guarantee data robustness is to store multiple copies and/or use erasure encoding, with some of the copies or shards written to geographically diverse data centers.</p> <p>Secondly, I think it's fair to say that the vast majority of the companies who require data robustness, and are either willing to pay $$$ to an enterprise distro company like Red Hat, or command a large enough paying customer base that they can afford to dictate terms to an enterprise distro, or hire a consultant such as Christoph, or have their own staffed Linux kernel teams, have tended to use O_DIRECT. So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.</p> <p>Next, the reason why fsync() has the behaviour that it does is that one of the most common cases of I/O storage errors in buffered use cases, certainly as seen by the community distros, is the user who pulls out a USB stick while it is in use. In that case, if there are dirtied pages in the page cache, the question is what can you do? Sooner or later the writes will time out, and if you leave the pages dirty, then it effectively becomes a permanent memory leak. You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off. And if you don't clear the dirty bit on an I/O error, then they can never be cleaned. You can't even re-insert the USB stick; the re-inserted USB stick will get a new block device. 
Worse, when the USB stick was pulled, it will have suffered a power drop, and see above about what could happen after a power drop for non-power fail certified flash devices --- it goes double for the cheap sh*t USB sticks found in the checkout aisle of Micro Center.</p> <p>So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by &quot;don't clear the dirty bit&quot;. For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.</p> <p>I can think of things that could be done --- for example, it could be switchable on a per-block device basis (or maybe a per-mount basis) whether or not the dirty bit gets cleared after the error is reported to userspace. And perhaps there could be a new unmount flag that causes all dirty pages to be wiped out, which could be used to recover after a permanent loss of the block device. But the question is who is going to invest the time to make these changes? If there is a company who is willing to pay to commission this work, it's almost certainly soluble. Or if a company which has a kernel team on staff is willing to direct an engineer to work on it, it certainly could be solved. But again, of the companies who have client code where we care about robustness and proper handling of failed disk drives, and which have a kernel team on staff, pretty much all of the ones I can think of (e.g., Oracle, Google, etc.) 
use O_DIRECT and they don't try to make buffered writes and error reporting via fsync(2) work well.</p> <p>In general these companies want low-level control over buffer cache eviction algorithms, which drives them towards the design decision of effectively implementing the page cache in userspace, and using O_DIRECT reads/writes.</p> <p>If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work. Let me know off-line if that's the case...</p> <pre><code>- Ted </code></pre> <p>&quot;&quot;&quot;</p> <hr /> <pre><code>From:&quot;Joshua D(dot) Drake&quot; &lt;jd(at)commandprompt(dot)com&gt; Date:2018-04-10 19:51:01 </code></pre> <p>-hackers,</p> <p>The thread is picking up over on the ext4 list. They don't update their archives as often as we do, so I can't link to the discussion. What would be the preferred method of sharing the info?</p> <p>Thanks,</p> <hr /> <pre><code>From:&quot;Joshua D(dot) Drake&quot; &lt;jd(at)commandprompt(dot)com&gt; Date:2018-04-10 20:57:34 </code></pre> <p>On 04/10/2018 12:51 PM, Joshua D. Drake wrote:</p> <blockquote> <p>-hackers,</p> <p>The thread is picking up over on the ext4 list. They don't update their archives as often as we do, so I can't link to the discussion. What would be the preferred method of sharing the info?</p> </blockquote> <p>Thanks to Anthony for this link:</p> <p><a href="http://lists.openwall.net/linux-ext4/2018/04/10/33">http://lists.openwall.net/linux-ext4/2018/04/10/33</a></p> <p>It isn't quite real time but it keeps things close enough.</p> <hr /> <pre><code>From:Jonathan Corbet &lt;corbet(at)lwn(dot)net&gt; Date:2018-04-11 12:05:27 </code></pre> <p>On Tue, 10 Apr 2018 17:40:05 +0200 Anthony Iliopoulos wrote:</p> <blockquote> <p>LSF/MM'18 is upcoming and it would have been the perfect opportunity but it's past the CFP deadline. 
It may still worth contacting the organizers to bring forward the issue, and see if there is a chance to have someone from Pg invited for further discussions.</p> </blockquote> <p>FWIW, it is my current intention to be sure that the development community is at least aware of the issue by the time LSFMM starts.</p> <p>The event is April 23-25 in Park City, Utah. I bet that room could be found for somebody from the postgresql community, should there be somebody who would like to represent the group on this issue. Let me know if an introduction or advocacy from my direction would be helpful.</p> <hr /> <pre><code>From:Greg Stark &lt;stark(at)mit(dot)edu&gt; Date:2018-04-11 12:23:49 </code></pre> <p>On 10 April 2018 at 19:58, Joshua D. Drake wrote:</p> <blockquote> <p>You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off.</p> </blockquote> <p>I always wondered why Linux didn't implement umount -f. It's been in BSD since forever and it's a major annoyance that it's missing in Linux. Even without leaking memory it still leaks other resources, causes confusion and awkward workarounds in UI and automation software.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-11 14:29:09 </code></pre> <p>Hi,</p> <p>On 2018-04-11 06:05:27 -0600, Jonathan Corbet wrote:</p> <blockquote> <p>The event is April 23-25 in Park City, Utah. I bet that room could be found for somebody from the postgresql community, should there be somebody who would like to represent the group on this issue. Let me know if an introduction or advocacy from my direction would be helpful.</p> </blockquote> <p>If that room can be found, I might be able to make it. 
Being in SF, I'm probably the physically closest PG dev involved in the discussion.</p> <p>Thanks for chiming in,</p> <hr /> <pre><code>From:Jonathan Corbet &lt;corbet(at)lwn(dot)net&gt; Date:2018-04-11 14:40:31 </code></pre> <p>On Wed, 11 Apr 2018 07:29:09 -0700 Andres Freund wrote:</p> <blockquote> <p>If that room can be found, I might be able to make it. Being in SF, I'm probably the physically closest PG dev involved in the discussion.</p> </blockquote> <p>OK, I've dropped the PC a note; hopefully you'll be hearing from them.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-17 21:19:53 </code></pre> <p>On Tue, Apr 10, 2018 at 05:54:40PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 10 April 2018 at 02:59, Craig Ringer wrote:</p> <blockquote> <p>Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.</p> </blockquote> <p>I'm kind of puzzled by this. Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well? 
I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).</p> </blockquote> <p>I don't think the write is <em>sent</em> to the NFS at the time of the write, so while the NFS side would reserve the space, it might not get the write request until after we return write success to the process.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-17 21:29:17 </code></pre> <p>On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>On 04/09/2018 12:29 AM, Bruce Momjian wrote:</p> <blockquote> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it seems something is wrong.</p> </blockquote> <p>That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.</p> </blockquote> <p>My more-considered crazy idea is to have a postgresql.conf setting like archive_command that allows the administrator to specify a command that will be run <em>after</em> fsync but before the checkpoint is marked as complete. While we can have write flush errors before fsync and never see the errors during fsync, we will not have write flush errors <em>after</em> fsync that are associated with previous writes.</p> <p>The script should check for I/O or space-exhaustion errors and return false in that case, in which case we can stop, or maybe stop and crash recover. 
We could have an exit of 1 do the former, and an exit of 2 do the latter.</p> <p>Also, if we are relying on WAL, we have to make sure WAL is actually safe with fsync, and I am betting only the O_DIRECT methods actually are safe:</p> <pre><code>#wal_sync_method = fsync                # the default is the first option
                                        # supported by the operating system:
                                        #   open_datasync
                                    --&gt; #   fdatasync (default on Linux)
                                    --&gt; #   fsync
                                    --&gt; #   fsync_writethrough
                                        #   open_sync
</code></pre> <p>I am betting the marked wal_sync_method methods are not safe since there is time between the write and fsync.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-17 21:32:45 </code></pre> <p>On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>On 04/09/2018 12:29 AM, Bruce Momjian wrote:</p> <blockquote> <p>A crazy idea would be to have a daemon that checks the logs and stops Postgres when it seems something is wrong.</p> </blockquote> <p>That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.</p> </blockquote> <p>Replying to your specific case, I am not sure how we would use a script to check for I/O errors/space-exhaustion if the postgres user doesn't have access to it. 
Does O_DIRECT work in such container cases?</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-17 21:34:53 </code></pre> <p>On 2018-04-17 17:29:17 -0400, Bruce Momjian wrote:</p> <blockquote> <p>Also, if we are relying on WAL, we have to make sure WAL is actually safe with fsync, and I am betting only the O_DIRECT methods actually are safe:</p> <pre><code>&gt; #wal_sync_method = fsync                # the default is the first option
&gt;                                         # supported by the operating system:
&gt;                                         #   open_datasync
&gt;                                     --&gt; #   fdatasync (default on Linux)
&gt;                                     --&gt; #   fsync
&gt;                                     --&gt; #   fsync_writethrough
&gt;                                         #   open_sync
</code></pre> <p>I am betting the marked wal_sync_method methods are not safe since there is time between the write and fsync.</p> </blockquote> <p>Hm? That's not really the issue though? One issue is that retries are not necessarily safe in buffered IO, the other that fsync might not report an error if the fd was closed and opened.</p> <p>O_DIRECT is only used if wal archiving or streaming isn't used, which makes it pretty useless anyway.</p> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-17 21:41:42 </code></pre> <p>On 2018-04-17 17:32:45 -0400, Bruce Momjian wrote:</p> <blockquote> <p>On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.</p> </blockquote> </blockquote> <p>You can certainly have access to the kernel log in containers. 
I'd assume such a script wouldn't check various system logs but instead tail /dev/kmsg or such. Otherwise the variance between installations would be too big.</p> <p>There's not <em>that</em> many different types of error messages and they don't change that often. If we'd just detect errors for the most common FSs we'd probably be good. Detecting a few general storage layer messages wouldn't be that hard either, most things have been unified over the last ~8-10 years.</p> <blockquote> <p>Replying to your specific case, I am not sure how we would use a script to check for I/O errors/space-exhaustion if the postgres user doesn't have access to it.</p> </blockquote> <p>Not sure what you mean?</p> <p>Space exhaustion can be checked when allocating space, FWIW. We'd just need to use posix_fallocate et al.</p> <blockquote> <p>Does O_DIRECT work in such container cases?</p> </blockquote> <p>Yes.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-17 21:49:42 </code></pre> <p>On Mon, Apr 9, 2018 at 12:25:33PM -0700, Peter Geoghegan wrote:</p> <blockquote> <p>On Mon, Apr 9, 2018 at 12:13 PM, Andres Freund wrote:</p> <blockquote> <p>Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.</p> </blockquote> <p>+1</p> <blockquote> <p>We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.</p> </blockquote> <p>Right. 
We seem to be implicitly assuming that there is a big difference between a problem in the storage layer that we could in principle detect, but don't, and any other problem in the storage layer. I've read articles claiming that technologies like SMART are not really reliable in a practical sense [1], so it seems to me that there is reason to doubt that this gap is all that big.</p> <p>That said, I suspect that the problems with running out of disk space are serious practical problems. I have personally scoffed at stories involving Postgres databases corruption that gets attributed to running out of disk space. Looks like I was dead wrong.</p> </blockquote> <p>Yes, I think we need to look at user expectations here.</p> <p>If the device has a hardware write error, it is true that it is good to detect it, and it might be permanent or temporary, e.g. NAS/NFS. The longer the error persists, the more likely the user will expect corruption. However, right now, any length outage could cause corruption, and it will not be reported in all cases.</p> <p>Running out of disk space is also something you don't expect to corrupt your database --- you expect it to only prevent future writes. 
It seems NAS/NFS and any thin provisioned storage will have this problem, and again, not always reported.</p> <p>So, our initial action might just be to educate users that write errors can cause silent corruption, and out-of-space errors on NAS/NFS and any thin provisioned storage can cause corruption.</p> <p>Kernel logs (not just Postgres logs) should be monitored for these issues and fail-over/recovering might be necessary.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-18 09:52:22 </code></pre> <p>On Tue, Apr 17, 2018 at 02:34:53PM -0700, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-17 17:29:17 -0400, Bruce Momjian wrote:</p> <blockquote> <p>Also, if we are relying on WAL, we have to make sure WAL is actually safe with fsync, and I am betting only the O_DIRECT methods actually are safe:</p> <pre><code>&gt; &gt; #wal_sync_method = fsync                # the default is the first option
&gt; &gt;                                         # supported by the operating system:
&gt; &gt;                                         #   open_datasync
&gt; &gt;                                     --&gt; #   fdatasync (default on Linux)
&gt; &gt;                                     --&gt; #   fsync
&gt; &gt;                                     --&gt; #   fsync_writethrough
&gt; &gt;                                         #   open_sync
</code></pre> <p>I am betting the marked wal_sync_method methods are not safe since there is time between the write and fsync.</p> </blockquote> <p>Hm? That's not really the issue though? One issue is that retries are not necessarily safe in buffered IO, the other that fsync might not report an error if the fd was closed and opened.</p> </blockquote> <p>Well, we have been focusing on the delay between backend or checkpoint writes and checkpoint fsyncs. My point is that we have the same problem in doing a write, <em>then</em> fsync for the WAL. Yes, the delay is much shorter, but the issue still exists. 
I realize that newer Linux kernels will not have the problem since the file descriptor remains open, but the problem exists with older/common Linux kernels.</p> <blockquote> <p>O_DIRECT is only used if wal archiving or streaming isn't used, which makes it pretty useless anyway.</p> </blockquote> <p>Uh, don't 'open_datasync' and 'open_sync' fsync as part of the write, meaning we can't lose the error report like we can with the others?</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-18 10:04:30 </code></pre> <p>On 18 April 2018 at 05:19, Bruce Momjian wrote:</p> <blockquote> <p>On Tue, Apr 10, 2018 at 05:54:40PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 10 April 2018 at 02:59, Craig Ringer wrote:</p> <blockquote> <p>Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.</p> </blockquote> <p>I'm kind of puzzled by this. Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well? 
I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).</p> </blockquote> <p>I don't think the write is <em>sent</em> to the NFS at the time of the write, so while the NFS side would reserve the space, it might not get the write request until after we return write success to the process.</p> </blockquote> <p>It should be sent if you're using sync mode.</p> <p>From my reading of the docs, if you're using async mode you're already open to so many potential corruptions you might as well not bother.</p> <p>I need to look into this more re NFS and expand the tests I have to cover that properly.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-18 10:19:28 </code></pre> <p>On 10 April 2018 at 20:15, Craig Ringer wrote:</p> <blockquote> <p>On 10 April 2018 at 14:10, Michael Paquier wrote:</p> <blockquote> <p>Well, I think that there is place for improving reporting of failure in file_utils.c for frontends, or at worst have an exit() for any kind of critical failures equivalent to a PANIC.</p> </blockquote> <p>Yup.</p> <p>In the meantime, speaking of PANIC, here's the first cut patch to make Pg panic on fsync() failures. I need to do some closer review and testing, but it's presented here for anyone interested.</p> <p>I intentionally left some failures as ERROR not PANIC, where the entire operation is done as a unit, and an ERROR will cause us to retry the whole thing.</p> <p>For example, when we fsync() a temp file before we move it into place, there's no point panicking on failure, because we'll discard the temp file on ERROR and retry the whole thing.</p> <p>I've verified that it works as expected with some modifications to the test tool I've been using (pushed).</p> <p>The main downside is that if we panic in redo, we don't try again. We throw our toys and shut down. 
But arguably if we get the same I/O error again in redo, that's the right thing to do anyway, and quite likely safer than continuing to ERROR on checkpoints indefinitely.</p> <p>Patch attached.</p> <p>To be clear, this patch only deals with the issue of us retrying fsyncs when it turns out to be unsafe. This does NOT address any of the issues where we won't find out about writeback errors at all.</p> </blockquote> <p>Thinking about this some more, it'll definitely need a GUC to force it to continue despite a potential hazard. Otherwise we go backwards from the status quo if we're in a position where uptime is vital and correctness problems can be tolerated or repaired later. Kind of like zero_damaged_pages, we'll need some sort of continue_after_fsync_errors.</p> <p>Without that, we'll panic once, enter redo, and if the problem persists we'll panic in redo and exit the startup process. That's not going to help users.</p> <p>I'll amend the patch accordingly as time permits.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-18 11:46:15 </code></pre> <p>On Wed, Apr 18, 2018 at 06:04:30PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 18 April 2018 at 05:19, Bruce Momjian wrote:</p> <blockquote> <p>On Tue, Apr 10, 2018 at 05:54:40PM +0100, Greg Stark wrote:</p> <blockquote> <p>On 10 April 2018 at 02:59, Craig Ringer wrote:</p> <blockquote> <p>Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.</p> </blockquote> <p>I'm kind of puzzled by this. Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well? 
I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).</p> </blockquote> <p>I don't think the write is <em>sent</em> to the NFS at the time of the write, so while the NFS side would reserve the space, it might get the write request until after we return write success to the process.</p> </blockquote> <p>It should be sent if you're using sync mode.</p> <blockquote> <p>From my reading of the docs, if you're using async mode you're already open to so many potential corruptions you might as well not bother.</p> </blockquote> <p>I need to look into this more re NFS and expand the tests I have to cover that properly.</p> </blockquote> <p>So, if sync mode passes the write to NFS, and NFS pre-reserves write space, and throws an error on reservation failure, that means that NFS will not corrupt a cluster on out-of-space errors.</p> <p>So, what about thin provisioning? I can understand sharing <em>free</em> space among file systems, but once a write arrives I assume it reserves the space. Is the problem that many thin provisioning systems don't have a sync mode, so you can't force the write to appear on the device before an fsync?</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-18 11:56:57 </code></pre> <p>On Tue, Apr 17, 2018 at 02:41:42PM -0700, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-17 17:32:45 -0400, Bruce Momjian wrote:</p> <blockquote> <p>On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:</p> <blockquote> <p>That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. 
And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.</p> </blockquote> </blockquote> <p>You can certainly have access to the kernel log in containers. I'd assume such a script wouldn't check various system logs but instead tail /dev/kmsg or such. Otherwise the variance between installations would be too big.</p> </blockquote> <p>I was thinking 'dmesg', but the result is similar.</p> <blockquote> <p>There's not <em>that</em> many different type of error messages and they don't change that often. If we'd just detect error for the most common FSs we'd probably be good. Detecting a few general storage layer message wouldn't be that hard either, most things have been unified over the last ~8-10 years.</p> </blockquote> <p>It is hard to know exactly what the message format should be for each operating system because it is hard to generate them on demand, and we would need to filter based on Postgres devices.</p> <p>The other issue is that once you see a message during a checkpoint and exit, you don't want to see that message again after the problem has been fixed and the server restarted. The simplest solution is to save the output of the last check and look for only new entries. I am attaching a script I run every 15 minutes from cron that emails me any unexpected kernel messages.</p> <p>I am thinking we would need a contrib module with sample scripts for various operating systems.</p> <blockquote> <blockquote> <p>Replying to your specific case, I am not sure how we would use a script to check for I/O errors/space-exhaustion if the postgres user doesn't have access to it.</p> </blockquote> <p>Not sure what you mean?</p> <p>Space exhaustiion can be checked when allocating space, FWIW. We'd just need to use posix_fallocate et al.</p> </blockquote> <p>I was asking about cases where permissions prevent viewing of kernel messages. 
I think you can view them in containers, but in virtual machines you might not have access to the host operating system's kernel messages, and that might be where they are.</p> <pre><code> Attachment: dmesg_check  Content-Type: text/plain  Size: 574 bytes </code></pre> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-18 12:45:53 </code></pre> <p>On 18 April 2018 at 19:46, Bruce Momjian wrote:</p> <blockquote> <p>So, if sync mode passes the write to NFS, and NFS pre-reserves write space, and throws an error on reservation failure, that means that NFS will not corrupt a cluster on out-of-space errors.</p> </blockquote> <p>Yeah. I need to verify in a concrete test case.</p> <p>The thing is that write() is allowed to be asynchronous anyway. Most file systems choose to implement eager reservation of space, but it's not mandated. AFAICS that's largely a historical accident to keep applications happy, because FSes used to <em>allocate</em> the space at write() time too, and when they moved to delayed allocations, apps tended to break too easily unless they at least reserved space. NFS would have to do a round-trip on write() to reserve space.</p> <p>The Linux man pages (<a href="http://man7.org/linux/man-pages/man2/write.2.html">http://man7.org/linux/man-pages/man2/write.2.html</a>) say:</p> <blockquote> <p>A successful return from write() does not make any guarantee that data has been committed to disk. On some filesystems, including NFS, it does not even guarantee that space has successfully been reserved for the data. In this case, some errors might be delayed until a future write(2), fsync(2), or even close(2). The only way to be sure is to call fsync(2) after you are done writing all your data.</p> </blockquote> <p>... and I'm inclined to believe it when it refuses to make guarantees. Especially lately.</p> <blockquote> <p>So, what about thin provisioning? 
I can understand sharing <em>free</em> space among file systems</p> </blockquote> <p>Most thin provisioning is done at the block level, not file system level. So the FS is usually unaware it's on a thin-provisioned volume. Usually the whole kernel is unaware, because the thin provisioning is done on the SAN end or by a hypervisor. But the same sort of thing may be done via LVM - see lvmthin. For example, you may make 100 different 1TB ext4 FSes, each on 1TB iSCSI volumes backed by SAN with a total of 50TB of concrete physical capacity. The SAN is doing block mapping and only allocating storage chunks to a given volume when the FS has written blocks to every previous free block in the previous storage chunk. It may also do things like block de-duplication, compression of storage chunks that aren't written to for a while, etc.</p> <p>The idea is that when the SAN's actual physically allocated storage gets to 40TB it starts telling you to go buy another rack of storage so you don't run out. You don't have to resize volumes, resize file systems, etc. All the storage space admin is centralized on the SAN and storage team, and your sysadmins, DBAs and app devs are none the wiser. You buy storage when you need it, not when the DBA demands they need a 200% free space margin just in case. Whether or not you agree with this philosophy or think it's sensible is kind of moot, because it's an extremely widespread model, and servers you work on may well be backed by thin provisioned storage <em>even if you don't know it</em>.</p> <p>Think of it as a bit like VM overcommit, for storage. You can malloc() as much memory as you like and everything's fine until you try to actually use it. Then you go to dirty a page, no free pages are available, and <em>boom</em>.</p> <p>The thing is, the SAN (or LVM) doesn't have any idea about the FS's internal in-memory free space counter and its space reservations. Nor does it understand any FS metadata. 
All it cares about is &quot;has this LBA ever been written to by the FS?&quot;. If so, it must make sure backing storage for it exists. If not, it won't bother.</p> <p>Most FSes only touch the blocks on dirty writeback, or sometimes lazily as part of delayed allocation. So if your SAN is running out of space and there's 100MB free, each of your 100 FSes may have decremented its freelist by 2MB and be happily promising more space to apps on write() because, well, as far as they know they're only 50% full. When they all do dirty writeback and flush to storage, kaboom, there's nowhere to put some of the data.</p> <p>I don't know if posix_fallocate is a sufficient safeguard either. You'd have to actually force writes to each page through to the backing storage to know for sure the space existed. Yes, the docs say</p> <blockquote> <p>After a successful call to posix_fallocate(), subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space.</p> </blockquote> <p>... but they're speaking from the filesystem's perspective. If the FS doesn't dirty and flush the actual blocks, a thin provisioned storage system won't know.</p> <p>It's reasonable enough to throw up our hands in this case and say &quot;your setup is crazy, you're breaking the rules, don't do that&quot;. The truth is they AREN'T breaking the rules, but we can disclaim support for such configurations anyway.</p> <p>After all, we tell people not to use Linux's VM overcommit too. How's that working for you? I see it enabled on the great majority of systems I work with, and some people are very reluctant to turn it off because they don't want to have to add swap.</p> <p>If someone has a 50TB SAN and wants to allow for unpredictable space use expansion between various volumes, and we say &quot;you can't do that, go buy a 100TB SAN instead&quot; ... that's not going to go down too well either. 
Often we can actually say &quot;make sure the 5TB volume PostgreSQL is using is eagerly provisioned, and expand it at need using online resize if required. We don't care about the rest of the SAN.&quot;.</p> <p>I guarantee you that when you create a 100GB EBS volume on AWS EC2, you don't get 100GB of storage preallocated. AWS are probably pretty good about not running out of backing store, though.</p> <p>There <em>are</em> file systems optimised for thin provisioning, etc, too. But that's more commonly done by having them do things like zero deallocated space so the thin provisioning system knows it can return it to the free pool, and now things like DISCARD provide much of that signalling in a standard way.</p> <hr /> <pre><code>From:Mark Kirkwood &lt;mark(dot)kirkwood(at)catalyst(dot)net(dot)nz&gt; Date:2018-04-18 23:31:50 </code></pre> <p>On 19/04/18 00:45, Craig Ringer wrote:</p> <blockquote> <p>I guarantee you that when you create a 100GB EBS volume on AWS EC2, you don't get 100GB of storage preallocated. AWS are probably pretty good about not running out of backing store, though.</p> </blockquote> <p>Some db folks (used to anyway) advise dd'ing to your freshly attached devices on AWS (for performance mainly IIRC), but that would help prevent some failure scenarios for any thin provisioned storage (but probably really annoy the admins thereof).</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-19 00:44:33 </code></pre> <p>On 19 April 2018 at 07:31, Mark Kirkwood wrote:</p> <blockquote> <p>On 19/04/18 00:45, Craig Ringer wrote:</p> <blockquote> <p>I guarantee you that when you create a 100GB EBS volume on AWS EC2, you don't get 100GB of storage preallocated. 
AWS are probably pretty good about not running out of backing store, though.</p> </blockquote> <p>Some db folks (used to anyway) advise dd'ing to your freshly attached devices on AWS (for performance mainly IIRC), but that would help prevent some failure scenarios for any thin provisioned storage (but probably really annoy the admins thereof).</p> </blockquote> <p>This still makes a lot of sense on AWS EBS, particularly when using a volume created from a non-empty snapshot. Performance of S3-snapshot based EBS volumes is spectacularly awful, since they're copy-on-read. Reading the whole volume helps a lot.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-20 20:49:08 </code></pre> <p>On Wed, Apr 18, 2018 at 08:45:53PM +0800, Craig Ringer wrote:</p> <blockquote> <p>On 18 April 2018 at 19:46, Bruce Momjian wrote:</p> <blockquote> <p>So, if sync mode passes the write to NFS, and NFS pre-reserves write space, and throws an error on reservation failure, that means that NFS will not corrupt a cluster on out-of-space errors.</p> </blockquote> <p>Yeah. I need to verify in a concrete test case.</p> </blockquote> <p>Thanks.</p> <blockquote> <p>The thing is that write() is allowed to be asynchronous anyway. Most file systems choose to implement eager reservation of space, but it's not mandated. AFAICS that's largely a historical accident to keep applications happy, because FSes used to <em>allocate</em> the space at write() time too, and when they moved to delayed allocations, apps tended to break too easily unless they at least reserved space. NFS would have to do a round-trip on write() to reserve space.</p> <p>The Linux man pages (<a href="http://man7.org/linux/man-pages/man2/write.2.html">http://man7.org/linux/man-pages/man2/write.2.html</a>) say:</p> <p>&quot; A successful return from write() does not make any guarantee that data has been committed to disk. 
On some filesystems, including NFS, it does not even guarantee that space has successfully been reserved for the data. In this case, some errors might be delayed until a future write(2), fsync(2), or even close(2). The only way to be sure is to call fsync(2) after you are done writing all your data. &quot;</p> <p>... and I'm inclined to believe it when it refuses to make guarantees. Especially lately.</p> </blockquote> <p>Uh, even calling fsync after write isn't 100% safe since the kernel could have flushed the dirty pages to storage, and failed, and the fsync would later succeed. I realize newer kernels have that fixed for files open during that operation, but that is the minority of installs.</p> <blockquote> <p>The idea is that when the SAN's actual physically allocated storage gets to 40TB it starts telling you to go buy another rack of storage so you don't run out. You don't have to resize volumes, resize file systems, etc. All the storage space admin is centralized on the SAN and storage team, and your sysadmins, DBAs and app devs are none the wiser. You buy storage when you need it, not when the DBA demands they need a 200% free space margin just in case. Whether or not you agree with this philosophy or think it's sensible is kind of moot, because it's an extremely widespread model, and servers you work on may well be backed by thin provisioned storage <em>even if you don't know it</em>.</p> <p>Most FSes only touch the blocks on dirty writeback, or sometimes lazily as part of delayed allocation. So if your SAN is running out of space and there's 100MB free, each of your 100 FSes may have decremented its freelist by 2MB and be happily promising more space to apps on write() because, well, as far as they know they're only 50% full. 
When they all do dirty writeback and flush to storage, kaboom, there's nowhere to put some of the data.</p> </blockquote> <p>I see what you are saying --- that the kernel is reserving the write space from its free space, but the free space doesn't all exist. I am not sure how we can tell people to make sure the file system free space is real.</p> <blockquote> <p>You'd have to actually force writes to each page through to the backing storage to know for sure the space existed. Yes, the docs say</p> <p>&quot; After a successful call to posix_fallocate(), subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space. &quot;</p> <p>... but they're speaking from the filesystem's perspective. If the FS doesn't dirty and flush the actual blocks, a thin provisioned storage system won't know.</p> </blockquote> <p>Frankly, in what cases will a write fail <em>for</em> lack of free space? It could be a new WAL file (not recycled), or pages added to the end of the table.</p> <p>Is that it? It doesn't sound too terrible. 
If we can eliminate the corruption due to free space exhaustion, it would be a big step forward.</p> <p>The next most common failure would be temporary storage failure or storage communication failure.</p> <p>Permanent storage failure is &quot;game over&quot; so we don't need to worry about that.</p> <hr /> <pre><code>From:Gasper Zejn &lt;zejn(at)owca(dot)info&gt; Date:2018-04-21 19:21:39 </code></pre> <p>Just for the record, I tried the test case with ZFS on Ubuntu 17.10 host with ZFS on Linux 0.6.5.11.</p> <p>ZFS does not swallow the fsync error, but the system does not handle the error nicely: the test case program hangs on fsync, the load jumps up and there's a bunch of z_wr_iss and z_null_int kernel threads belonging to zfs, eating up the CPU.</p> <p>Even then I managed to reboot the system, so it's not a complete and utter mess.</p> <p>The test case adjustments are here: <a href="https://github.com/zejn/scrapcode/commit/e7612536c346d59a4b69bedfbcafbe8c1079063c">https://github.com/zejn/scrapcode/commit/e7612536c346d59a4b69bedfbcafbe8c1079063c</a></p> <p>Kind regards,</p> <hr /> <p>On 29. 03. 2018 07:25, Craig Ringer wrote:</p> <blockquote> <p>On 29 March 2018 at 13:06, Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com</p> <pre><code>On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby &gt; The retries are the source of the problem ; the first fsync() can return EIO, &gt; and also *clears the error* causing a 2nd fsync (of the same data) to return &gt; success. &gt; What I'm failing to grok here is how that error flag even matters, &gt; whether it's a single bit or a counter as described in that patch. If &gt; write back failed, *the page is still dirty*. So all future calls to &gt; fsync() need to try to flush it again, and (presumably) fail &gt; again (unless it happens to succeed this time around). </code></pre> <p>You'd think so. But it doesn't appear to work that way. 
You can see for yourself with the error device-mapper destination mapped over part of a volume.</p> <p>I wrote a test case here.</p> <p><a href="https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c">https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c</a></p> <p>I don't pretend the kernel behaviour is sane. And it's possible I've made an error in my analysis. But since I've observed this in the wild, and seen it in a test case, I strongly suspect that what I've described is just what's happening, brain-dead or no.</p> <p>Presumably the kernel marks the page clean when it dispatches it to the I/O subsystem and doesn't dirty it again on I/O error? I haven't dug that deep on the kernel side. See the stackoverflow post for details on what I found in kernel code analysis.</p> </blockquote> <hr /> <pre><code>From:Andres Freund &lt;andres(at)anarazel(dot)de&gt; Date:2018-04-23 20:14:48 </code></pre> <p>Hi,</p> <p>On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means &quot;all writes since the last fsync have hit disk&quot; but we assume it means &quot;all writes since the last SUCCESSFUL fsync have hit disk&quot;.</p> <p>But then we retried the checkpoint, which retried the fsync(). The retry succeeded, because the prior fsync() <em>cleared the AS_EIO bad page flag</em>.</p> </blockquote> <p>Random other thing we should look at: Some filesystems (nfs yes, xfs ext4 no) flush writes at close(2). We check close() return code, just log it... 
So close() counts as an fsync for such filesystems.</p> <p>I'm at LSF/MM to discuss future behaviour of linux here, but that's how it is right now.</p> <hr /> <pre><code>From:Bruce Momjian &lt;bruce(at)momjian(dot)us&gt; Date:2018-04-24 00:09:23 </code></pre> <p>On Mon, Apr 23, 2018 at 01:14:48PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means &quot;all writes since the last fsync have hit disk&quot; but we assume it means &quot;all writes since the last SUCCESSFUL fsync have hit disk&quot;.</p> <p>But then we retried the checkpoint, which retried the fsync(). The retry succeeded, because the prior fsync() <em>cleared the AS_EIO bad page flag</em>.</p> </blockquote> <p>Random other thing we should look at: Some filesystems (nfs yes, xfs ext4 no) flush writes at close(2). We check close() return code, just log it... So close() counts as an fsync for such filesystems.</p> </blockquote> <p>Well, that's interesting. You might remember that NFS does not reserve space for writes like local file systems like ext4/xfs do. 
For that reason, we might be able to capture the out-of-space error on close and exit sooner for NFS.</p> <hr /> <pre><code>From:Craig Ringer &lt;craig(at)2ndquadrant(dot)com&gt; Date:2018-04-26 02:16:52 </code></pre> <p>On 24 April 2018 at 04:14, Andres Freund wrote:</p> <blockquote> <p>I'm at LSF/MM to discuss future behaviour of linux here, but that's how it is right now.</p> </blockquote> <p>Interim LWN.net coverage of that can be found here: <a href="https://lwn.net/Articles/752613/">https://lwn.net/Articles/752613/</a></p> <hr /> <pre><code>From:Thomas Munro &lt;thomas(dot)munro(at)enterprisedb(dot)com&gt; Date:2018-04-27 01:18:55 </code></pre> <p>On Tue, Apr 24, 2018 at 12:09 PM, Bruce Momjian wrote:</p> <blockquote> <p>On Mon, Apr 23, 2018 at 01:14:48PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:</p> <blockquote> <p>TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means &quot;all writes since the last fsync have hit disk&quot; but we assume it means &quot;all writes since the last SUCCESSFUL fsync have hit disk&quot;.</p> <p>But then we retried the checkpoint, which retried the fsync(). The retry succeeded, because the prior fsync() <em>cleared the AS_EIO bad page flag</em>.</p> </blockquote> <p>Random other thing we should look at: Some filesystems (nfs yes, xfs ext4 no) flush writes at close(2). We check close() return code, just log it... So close() counts as an fsync for such filesystems.</p> </blockquote> <p>Well, that's interesting. You might remember that NFS does not reserve space for writes like local file systems like ext4/xfs do. 
For that reason, we might be able to capture the out-of-space error on close and exit sooner for NFS.</p> </blockquote> <p>It seems like some implementations flush on close and therefore discover ENOSPC problems at that point, unless they have NFSv4 (RFC 3530) &quot;write delegation&quot; with a promise from the server that a certain amount of space is available. It seems like you can't count on that in any way though, because it's the server that decides when to delegate and how much space to promise is preallocated, not the client. So in userspace you always need to be able to handle errors including ENOSPC returned by close(), and if you ignore that and you're using an operating system that immediately incinerates all evidence after telling you that (so that later fsync() doesn't fail), you're in trouble.</p> <p>Some relevant code:</p> <ul> <li><a href="https://github.com/torvalds/linux/commit/5445b1fbd123420bffed5e629a420aa2a16bf849">https://github.com/torvalds/linux/commit/5445b1fbd123420bffed5e629a420aa2a16bf849</a></li> <li><a href="https://github.com/freebsd/freebsd/blob/master/sys/fs/nfsclient/nfs_clvnops.c#L618">https://github.com/freebsd/freebsd/blob/master/sys/fs/nfsclient/nfs_clvnops.c#L618</a></li> </ul> <p>It looks like the bleeding edge of the NFS spec includes a new ALLOCATE operation that should be able to support posix_fallocate() (if we were to start using that for extending files):</p> <p><a href="https://tools.ietf.org/html/rfc7862#page-64">https://tools.ietf.org/html/rfc7862#page-64</a></p> <p>I'm not sure how reliable [posix_]fallocate is on NFS in general though, and it seems that there are fall-back implementations of posix_fallocate() that write zeros (or even just feign success?) which probably won't do anything useful here if not also flushed (that fallback strategy might only work on eager reservation filesystems that don't have direct fallocate support?) 
so there are several layers (libc, kernel, nfs client, nfs server) that'd need to be aligned for that to work, and it's not clear how a humble userspace program is supposed to know if they are.</p> <p>I guess if you could find a way to amortise the cost of extending (like Oracle et al do by extending big container datafiles 10MB at a time or whatever), then simply writing zeros and flushing when doing that might work out OK, so you wouldn't need such a thing? (Unless of course it's a COW filesystem, but that's a different can of worms.)</p> <hr /> <p><em>This thread continues on the <code>ext4</code> mailing list:</em> <hr /></p> <pre><code>From: &quot;Joshua D. Drake&quot; &lt;jd@...mandprompt.com&gt; Subject: fsync() errors is unsafe and risks data loss Date: Tue, 10 Apr 2018 09:28:15 -0700 </code></pre> <p>-ext4,</p> <p>If this is not the appropriate list please point me in the right direction. I am a PostgreSQL contributor and we have come across a reliability problem with writes and fsync(). You can see the thread here:</p> <p><a href="https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz">https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz</a></p> <p>The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.</p> <hr /> <pre><code>From: &quot;Darrick J. Wong&quot; &lt;darrick.wong@...cle.com&gt; Date: Tue, 10 Apr 2018 09:54:43 -0700 </code></pre> <p>On Tue, Apr 10, 2018 at 09:28:15AM -0700, Joshua D. Drake wrote:</p> <blockquote> <p>-ext4,</p> <p>If this is not the appropriate list please point me in the right direction. I am a PostgreSQL contributor and we have come across a reliability problem with writes and fsync(). 
You can see the thread here:</p> <p><a href="https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz">https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz</a></p> <p>The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.</p> </blockquote> <p>You might try the XFS list (linux-xfs@...r.kernel.org) seeing as the initial complaint is against xfs behaviors...</p> <hr /> <pre><code>From: &quot;Joshua D. Drake&quot; &lt;jd@...mandprompt.com&gt; Date: Tue, 10 Apr 2018 09:58:21 -0700 </code></pre> <p>On 04/10/2018 09:54 AM, Darrick J. Wong wrote:</p> <blockquote> <p>On Tue, Apr 10, 2018 at 09:28:15AM -0700, Joshua D. Drake wrote:</p> <blockquote> <p>-ext4,</p> <p>If this is not the appropriate list please point me in the right direction. I am a PostgreSQL contributor and we have come across a reliability problem with writes and fsync(). You can see the thread here:</p> <p><a href="https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz">https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz</a></p> <p>The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.</p> </blockquote> <p>You might try the XFS list (linux-xfs@...r.kernel.org) seeing as the initial complaint is against xfs behaviors...</p> </blockquote> <p>Later in the thread it becomes apparent that it applies to ext4 (NFS too) as well. I picked ext4 because I assumed it is the most populated of the lists since it's the default filesystem for most distributions.</p> <hr /> <pre><code>From: &quot;Theodore Y. 
Ts'o&quot; &lt;tytso@....edu&gt; Date: Tue, 10 Apr 2018 14:43:56 -0400 </code></pre> <p>Hi Joshua,</p> <p>This isn't actually an ext4 issue, but a long-standing VFS/MM issue.</p> <p>There are going to be multiple opinions about what the right thing to do. I'll try to give as unbiased a description as possible, but certainly some of this is going to be filtered by my own biases no matter how careful I can be.</p> <p>First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.</p> <p>Which is why after a while, one can get quite paranoid and assume that the only way you can guarantee data robustness is to store multiple copies and/or use erasure encoding, with some of the copies or shards written to geographically diverse data centers.</p> <p>Secondly, I think it's fair to say that the vast majority of the companies who require data robustness, and are either willing to pay $$$ to an enterprise distro company like Red Hat, or command a large enough paying customer base that they can afford to dictate terms to an enterprise distro, or hire a consultant such as Christoph, or have their own staffed Linux kernel teams, have tended to use O_DIRECT. 
So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.</p> <p>Next, the reason why fsync() has the behaviour that it does is that one of the most common cases of I/O storage errors in buffered use cases, certainly as seen by the community distros, is the user who pulls out a USB stick while it is in use. In that case, if there are dirtied pages in the page cache, the question is what can you do? Sooner or later the writes will time out, and if you leave the pages dirty, then it effectively becomes a permanent memory leak. You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off. And if you don't clear the dirty bit on an I/O error, then they can never be cleaned. You can't even re-insert the USB stick; the re-inserted USB stick will get a new block device. Worse, when the USB stick was pulled, it will have suffered a power drop, and see above about what could happen after a power drop for non-power fail certified flash devices --- it goes double for the cheap sh*t USB sticks found in the checkout aisle of Micro Center.</p> <p>So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by &quot;don't clear the dirty bit&quot;. For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.</p> <p>I can think of things that could be done --- for example, it could be switchable on a per-block device basis (or maybe a per-mount basis) whether or not the dirty bit gets cleared after the error is reported to userspace. 
And perhaps there could be a new unmount flag that causes all dirty pages to be wiped out, which could be used to recover after a permanent loss of the block device. But the question is who is going to invest the time to make these changes? If there is a company who is willing to pay to commission this work, it's almost certainly soluble. Or if a company which has a kernel team on staff is willing to direct an engineer to work on it, it certainly could be solved. But again, of the companies who have client code where we care about robustness and proper handling of failed disk drives, and which have a kernel team on staff, pretty much all of the ones I can think of (e.g., Oracle, Google, etc.) use O_DIRECT and they don't try to make buffered writes and error reporting via fsync(2) work well.</p> <p>In general these companies want low-level control over buffer cache eviction algorithms, which drives them towards the design decision of effectively implementing the page cache in userspace, and using O_DIRECT reads/writes.</p> <p>If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work. Let me know off-line if that's the case...</p> <hr /> <pre><code>From: Andreas Dilger &lt;adilger@...ger.ca&gt; Date: Tue, 10 Apr 2018 13:44:48 -0600 </code></pre> <p>On Apr 10, 2018, at 10:50 AM, Joshua D. Drake <a href="mailto:jd@...mandprompt.com">jd@...mandprompt.com</a> wrote:</p> <blockquote> <p>-ext4,</p> <p>If this is not the appropriate list please point me in the right direction. I am a PostgreSQL contributor and we have come across a reliability problem with writes and fsync(). 
You can see the thread here:</p> <p><a href="https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz">https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz</a></p> <p>The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.</p> </blockquote> <p>Yes, this is a very long thread. The summary is Postgres is unhappy that fsync() on Linux (and also other OSes) returns an error once if there was a prior write() failure, instead of keeping dirty pages in memory forever and trying to rewrite them.</p> <p>This behaviour has existed on Linux forever, and (for better or worse) is the only reasonable behaviour that the kernel can take. I've argued for the opposite behaviour at times, and some subsystems already do limited retries before finally giving up on a failed write, though there are also times when retrying at lower levels is pointless if a higher level of code can handle the failure (e.g. mirrored block devices, filesystem data mirroring, userspace data mirroring, or cross-node replication).</p> <p>The confusion is whether fsync() is a &quot;level&quot; state (return error forever if there were pages that could not be written), or an &quot;edge&quot; state (return error only for any write failures since the previous fsync() call).</p> <p>I think Anthony Iliopoulos was pretty clear in his multiple descriptions in that thread of why the current behaviour is needed (OOM of the whole system if dirty pages are kept around forever), but many others were stuck on &quot;I can't believe this is happening??? 
This is totally unacceptable and every kernel needs to change to match my expectations!!!&quot; without looking at the larger picture of what is practical to change and where the issue should best be fixed.</p> <p>Regardless of why this is the case, the net is that PG needs to deal with all of the systems that currently exist that have this behaviour, even if some day in the future it may change (though that is unlikely). It seems ironic that &quot;keep dirty pages in userspace until fsync() returns success&quot; is totally unacceptable, but &quot;keep dirty pages in the kernel&quot; is fine. My (limited) understanding of databases was that they preferred to cache everything in userspace and use O_DIRECT to write to disk (which returns an error immediately if the write fails and does not double buffer data).</p> <hr /> <pre><code>From: Martin Steigerwald &lt;martin@...htvoll.de&gt; Date: Tue, 10 Apr 2018 21:47:21 +0200 </code></pre> <p>Hi Theodore, Darrick, Joshua.</p> <p>CC´d fsdevel as it does not appear to be Ext4 specific to me (and to you as well, Theodore).</p> <p>Theodore Y. Ts'o - 10.04.18, 20:43:</p> <blockquote> <p>This isn't actually an ext4 issue, but a long-standing VFS/MM issue. […] First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.</p> </blockquote> <p>Guh. I was not aware of this. 
I knew consumer-grade SSDs often do not have power loss protection, but still thought they´d handle garbage collection in an atomic way. Sometimes I am tempted to sing an &quot;all hardware is crap&quot; song (starting with Meltdown/Spectre, then probably heading over to storage devices and so on… including firmware crap like Intel ME).</p> <blockquote> <p>Next, the reason why fsync() has the behaviour that it does is that one of the most common cases of I/O storage errors in buffered use cases, certainly as seen by the community distros, is the user who pulls out a USB stick while it is in use. In that case, if there are dirtied pages in the page cache, the question is what can you do? Sooner or later the writes will time out, and if you leave the pages dirty, then it effectively becomes a permanent memory leak. You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off. And if you don't clear the dirty bit on an I/O error, then they can never be cleaned. You can't even re-insert the USB stick; the re-inserted USB stick will get a new block device. Worse, when the USB stick was pulled, it will have suffered a power drop, and see above about what could happen after a power drop for non-power fail certified flash devices --- it goes double for the cheap sh*t USB sticks found in the checkout aisle of Micro Center.</p> <p>From the original PostgreSQL mailing list thread I did not gather how exactly FreeBSD differs in behavior, compared to Linux. I am aware of one operating system that from a user point of view handles this in almost the right way IMHO: AmigaOS.</p> </blockquote> <p>When you removed a floppy disk from the drive while the OS was writing to it, it showed a &quot;You MUST insert volume somename into drive somedrive:&quot; and if you did, it just continued writing. 
(The part that did not work well was that with the original filesystem if you did not insert it back, the whole disk was corrupted, usually to the point beyond repair, so the &quot;MUST&quot; was no joke.)</p> <p>In my opinion from a user´s point of view this is the only sane way to handle the premature removal of removable media. I have read of a GSoC project to implement something like this for NetBSD but I did not check on the outcome of it. But in MS-DOS I think there has been something similar, however MS-DOS is not a multitasking operating system as AmigaOS is.</p> <p>Implementing something like this for Linux would be quite a feat, I think, cause in addition to the implementation in the kernel, the desktop environment or whatever other userspace you use would need to handle it as well, so you´d have to adapt udev / udisks / probably Systemd. And probably this behavior needs to be restricted to anything that is really removable and even then in order to prevent memory exhaustion in case processes continue to write to a removed and not yet re-inserted USB harddisk the kernel would need to halt I/O processes which dirty I/O to this device. (I believe this is what AmigaOS did. It just blocked all subsequent I/O to the device until it was re-inserted. But then the I/O handling in that OS at that time is quite different from what Linux does.)</p> <blockquote> <p>So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by &quot;don't clear the dirty bit&quot;. 
For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.</p> </blockquote> <p>I was not aware that flash based media may be as crappy as you hint at.</p> <blockquote> <p>From my tests with AmigaOS 4.something or AmigaOS 3.9 + 3rd Party Poseidon USB stack the above mechanism worked even with USB sticks. I however did not test this often and I did not check for data corruption after a test.</p> </blockquote> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Tue, 10 Apr 2018 15:07:26 -0700 </code></pre> <p>(Sorry if I screwed up the thread structure - I had to reconstruct the reply-to and CC list from web archive as I've not found a way to properly download an mbox or such of old content. Was subscribed to fsdevel but not ext4 lists)</p> <p>Hi,</p> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>I'll try to give as unbiased a description as possible, but certainly some of this is going to be filtered by my own biases no matter how careful I can be.</p> </blockquote> <p>Same ;)</p> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.</p> </blockquote> <p>That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.</p> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by &quot;don't clear the dirty bit&quot;. 
For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.</p> </blockquote> <p>I don't think these necessarily are as contradictory goals as you paint them. At least in postgres' case we can deal with the fact that an fsync retry isn't going to fix the problem by reentering crash recovery or just shutting down - therefore we don't need to keep all the dirty buffers around. A per-inode or per-superblock bit that causes further fsyncs to fail would be entirely sufficient for that.</p> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.</p> <p>Both in postgres, <em>and</em> a lot of other applications, it's not at all guaranteed to consistently have one FD open for every file written. Therefore even the more recent per-fd errseq logic doesn't guarantee that the failure will ever be seen by an application diligently fsync()ing.</p> <p>You'd not even need to have per inode information or such in the case that the block device goes away entirely. As the FS isn't generally unmounted in that case, you could trivially keep a per-mount (or superblock?) bit that says &quot;I died&quot; and set that instead of keeping per inode/whatever information.</p> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work.</p> </blockquote> <p>I find it a bit of a disappointing response. 
I think it's fair to say that for advanced features, but we're talking about the basic guarantee that fsync actually does something even remotely reasonable.</p> <p>2018-04-10 19:44:48 Andreas wrote:</p> <blockquote> <p>The confusion is whether fsync() is a &quot;level&quot; state (return error forever if there were pages that could not be written), or an &quot;edge&quot; state (return error only for any write failures since the previous fsync() call).</p> </blockquote> <p>I don't think that's the full issue. We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, <em>THAT'S</em> the big issue.</p> <p>Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory, fsync everything, and <em>then</em> assume you're safe. But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.</p> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>I think Anthony Iliopoulos was pretty clear in his multiple descriptions in that thread of why the current behaviour is needed (OOM of the whole system if dirty pages are kept around forever), but many others were stuck on &quot;I can't believe this is happening??? 
This is totally unacceptable and every kernel needs to change to match my expectations!!!&quot; without looking at the larger picture of what is practical to change and where the issue should best be fixed.</p> </blockquote> <p>Everyone can participate in discussions...</p> <hr /> <pre><code>From: Andreas Dilger &lt;adilger@...ger.ca&gt; Date: Wed, 11 Apr 2018 15:52:44 -0600 </code></pre> <p>On Apr 10, 2018, at 4:07 PM, Andres Freund <a href="mailto:andres@...razel.de">andres@...razel.de</a> wrote:</p> <blockquote> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.</p> </blockquote> <p>That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.</p> </blockquote> <p>Sure, but it is mostly PG that is doing (IMHO) crazy things like writing to thousands(?) of files, closing the file descriptors, then expecting fsync() on a newly-opened fd to return a historical error. If an editor tries to write a file, then calls fsync and gets an error, the user will enter a new pathname and retry the write. The package manager will assume the package installation failed, and uninstall the parts of the package that were already written.</p> <p>There is no way the filesystem can handle the package manager failure case, and keeping the pages dirty and retrying indefinitely may never work (e.g. disk is dead or disconnected, is a sparse volume without any free space, etc). 
This (IMHO) implies that the higher layer (which knows more about what the write failure implies) needs to deal with this.</p> <blockquote> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by &quot;don't clear the dirty bit&quot;. For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.</p> </blockquote> <p>I don't think these necessarily are as contradictory goals as you paint them. At least in postgres' case we can deal with the fact that an fsync retry isn't going to fix the problem by reentering crash recovery or just shutting down - therefore we don't need to keep all the dirty buffers around. A per-inode or per-superblock bit that causes further fsyncs to fail would be entirely sufficient for that.</p> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.</p> </blockquote> <p>I think there are two issues here - &quot;fsync() on an fd that was just opened&quot; and &quot;persistent error state (without keeping dirty pages in memory)&quot;.</p> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> <p>Consider if there was a per-inode &quot;there was once an error writing this inode&quot; flag. 
Then fsync() would return an error on the inode forever, since there is no way in POSIX to clear this state, since it would need to be kept in case some new fd is opened on the inode and does an fsync() and wants the error to be returned.</p> <p>IMHO, the only alternative would be to keep the dirty pages in memory until they are written to disk. If that was not possible, what then? It would need a reboot to clear the dirty pages, or truncate the file (discarding all data)?</p> <blockquote> <p>Both in postgres, <em>and</em> a lot of other applications, it's not at all guaranteed to consistently have one FD open for every file written. Therefore even the more recent per-fd errseq logic doesn't guarantee that the failure will ever be seen by an application diligently fsync()ing.</p> </blockquote> <p>... only if the application closes all fds for the file before calling fsync. If any fd is kept open from the time of the failure, it will return the original error on fsync() (and then no longer return it).</p> <p>It's not that you need to keep every fd open forever. You could put them into a shared pool, and re-use them if the file is &quot;re-opened&quot;, and call fsync on each fd before it is closed (because the pool is getting too big or because you want to flush the data for that file, or shut down the DB). That wouldn't require a huge re-architecture of PG, just a small library to handle the shared fd pool.</p> <p>That might even improve performance, because opening and closing files is itself not free, especially if you are working with remote filesystems.</p> <blockquote> <p>You'd not even need to have per inode information or such in the case that the block device goes away entirely. As the FS isn't generally unmounted in that case, you could trivially keep a per-mount (or superblock?) 
bit that says &quot;I died&quot; and set that instead of keeping per inode/whatever information.</p> </blockquote> <p>The filesystem will definitely return an error in this case, I don't think this needs any kind of changes:</p> <p>int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync) { if (unlikely(ext4_forced_shutdown(EXT4_SB(inode-&gt;i_sb)))) return -EIO;</p> <blockquote> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work.</p> </blockquote> <p>I find it a bit of a disappointing response. I think it's fair to say that for advanced features, but we're talking about the basic guarantee that fsync actually does something even remotely reasonable.</p> </blockquote> <p>Linux (like PG) is run by people who develop it for their own needs, or are paid to develop it for the needs of others. Everyone already has too much work to do, so you need to find someone who has an interest in fixing this (IMHO very peculiar) use case. If PG developers want to add a tunable &quot;keep dirty pages in RAM on IO failure&quot;, I don't think that it would be too hard for someone to do. It might be harder to convince some of the kernel maintainers to accept it, and I've been on the losing side of that battle more than once. However, like everything you don't pay for, you can't require someone else to do this for you. It wouldn't hurt to see if Jeff Layton, who wrote the errseq patches, would be interested to work on something like this.</p> <p>That said, <em>even</em> if a fix was available for Linux tomorrow, it would be <em>years</em> before a majority of users would have it available on their system, that includes even the errseq mechanism that landed a few months ago. 
That implies to me that you'd want something that fixes PG <em>now</em> so that it works around whatever (perceived) breakage exists in the Linux fsync() implementation. Since the thread indicates that non-Linux kernels have the same fsync() behaviour, it makes sense to do that even if the Linux fix was available.</p> <blockquote> <p>2018-04-10 19:44:48 Andreas wrote:</p> <blockquote> <p>The confusion is whether fsync() is a &quot;level&quot; state (return error forever if there were pages that could not be written), or an &quot;edge&quot; state (return error only for any write failures since the previous fsync() call).</p> </blockquote> <p>I don't think that's the full issue. We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, <em>THAT'S</em> the big issue.</p> <p>Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory, fsync everything, and <em>then</em> assume you're safe. But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.</p> </blockquote> <p>I can't say how common or uncommon such a workload is, though PG is the only application that I've heard of doing it, and I've been working on filesystems for 20 years. I'm a bit surprised that anyone expects fsync() on a newly-opened fd to have any state from write() calls that predate the open. I can understand fsync() returning an error for any IO that happens within the context of that fsync(), but how far should it go back for reporting errors on that file? Forever? 
The only way to clear the error would be to reboot the system, since I'm not aware of any existing POSIX code to clear such an error.</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Thu, 12 Apr 2018 10:09:16 +1000 </code></pre> <p>On Wed, Apr 11, 2018 at 03:52:44PM -0600, Andreas Dilger wrote: &gt; On Apr 10, 2018, at 4:07 PM, Andres Freund <a href="mailto:andres@...razel.de">andres@...razel.de</a> wrote: &gt; &gt; 2018-04-10 18:43:56 Ted wrote: &gt; &gt;&gt; So for better or for worse, there has not been as much investment in &gt; &gt;&gt; buffered I/O and data robustness in the face of exception handling of &gt; &gt;&gt; storage devices. &gt; &gt; &gt; &gt; That's a bit of a cop out. It's not just databases that care. Even more &gt; &gt; basic tools like SCM, package managers and editors care whether they can &gt; &gt; get proper responses back from fsync that imply things actually were synced. &gt; &gt; Sure, but it is mostly PG that is doing (IMHO) crazy things like writing &gt; to thousands(?) of files, closing the file descriptors, then expecting &gt; fsync() on a newly-opened fd to return a historical error.</p> <p>Yeah, this seems like a recipe for disaster, especially on cross-platform code where every OS platform behaves differently and almost never to expectation.</p> <p>And speaking of &quot;behaving differently to expectations&quot;, nobody has mentioned that close() can also return write errors. Hence if you do write - close - open - fsync the write error might get reported on close, not fsync. IOWs, the assumption that &quot;async writeback errors will persist across close to open&quot; is fundamentally broken to begin with. It's even documented as a silent data loss vector in the close(2) man page:</p> <pre><code>$ man 2 close ..... 
Dealing with error returns from close() A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close() that releases the open file description. Failing to check the return value when closing a file may lead to silent loss of data. This can especially be observed with NFS and with disk quota. </code></pre> <p>Yeah, ensuring data integrity in the face of IO errors is a really hard problem. :/</p> <p>To pound the broken record: there are many good reasons why Linux filesystem developers have said &quot;you should use direct IO&quot; to the PG devs each time we have this &quot;the kernel doesn't do [complex things PG needs]&quot; discussion.</p> <p>In this case, robust IO error reporting is easy with DIO. It's one of the reasons most of the high performance database engines are either using or moving to non-blocking AIO+DIO (RWF_NOWAIT) and use O_DSYNC/RWF_DSYNC for integrity-critical IO dispatch. This is also being driven by the availability of high performance, high IOPS solid state storage where buffering in RAM to optimise IO patterns and throughput provides no real performance benefit.</p> <p>Using the AIO+DIO infrastructure ensures errors are reported for the specific write that fails at failure time (i.e. 
in the aio completion event for the specific IO), yet high IO throughput can be maintained without the application needing its own threading infrastructure to prevent blocking.</p> <p>This means the application doesn't have to guess where the write error occurred to retry/recover, have to handle async write errors on close(), have to use fsync() to gather write IO errors and then infer where the IO failure was, or require kernels on every supported platform to jump through hoops to try to do exactly the right thing in error conditions for everyone in all circumstances at all times....</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Wed, 11 Apr 2018 19:17:52 -0700 </code></pre> <p>On 2018-04-11 15:52:44 -0600, Andreas Dilger wrote:</p> <blockquote> <p>On Apr 10, 2018, at 4:07 PM, Andres Freund <a href="mailto:andres@...razel.de">andres@...razel.de</a> wrote:</p> <blockquote> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.</p> </blockquote> <p>That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.</p> </blockquote> <p>Sure, but it is mostly PG that is doing (IMHO) crazy things like writing to thousands(?) of files, closing the file descriptors, then expecting fsync() on a newly-opened fd to return a historical error.</p> </blockquote> <p>It's not just postgres. dpkg (underlying apt, on debian derived distros) to take an example I just randomly guessed, does too:</p> <pre><code> /* We want to guarantee the extracted files are on the disk, so that the * subsequent renames to the info database do not end up with old or zero * length files in case of a system crash. 
As neither dpkg-deb nor tar do * explicit fsync()s, we have to do them here. * XXX: This could be avoided by switching to an internal tar extractor. */ dir_sync_contents(cidir); </code></pre> <p>(a bunch of other places too)</p> <p>Especially on ext3 but also on newer filesystems it's performancewise entirely infeasible to fsync() every single file individually - the performance becomes entirely atrocious if you do that.</p> <p>I think there's some legitimate arguments that a database should use direct IO (more on that as a reply to David), but claiming that all sorts of random utilities need to use DIO with buffering etc is just insane.</p> <blockquote> <p>If an editor tries to write a file, then calls fsync and gets an error, the user will enter a new pathname and retry the write. The package manager will assume the package installation failed, and uninstall the parts of the package that were already written.</p> </blockquote> <p>Except that they won't notice that they got a failure, at least in the dpkg case. And happily continue installing corrupted data.</p> <blockquote> <p>There is no way the filesystem can handle the package manager failure case, and keeping the pages dirty and retrying indefinitely may never work (e.g. disk is dead or disconnected, is a sparse volume without any free space, etc). This (IMHO) implies that the higher layer (which knows more about what the write failure implies) needs to deal with this.</p> </blockquote> <p>Yea, I agree that'd not be sane. As far as I understand the dpkg code (all of 10min reading it), that'd also be unnecessary. It can abort the installation, but only if it detects the error. Which isn't happening.</p> <blockquote> <blockquote> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. 
If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.</p> </blockquote> <p>I think there are two issues here - &quot;fsync() on an fd that was just opened&quot; and &quot;persistent error state (without keeping dirty pages in memory)&quot;.</p> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> </blockquote> <p>And that's <em>horrible</em>. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.</p> <p>Or even more extreme, you untar/zip/git clone a directory. Then do a sync. And you don't know whether anything actually succeeded.</p> <blockquote> <p>Consider if there was a per-inode &quot;there was once an error writing this inode&quot; flag. Then fsync() would return an error on the inode forever, since there is no way in POSIX to clear this state, since it would need to be kept in case some new fd is opened on the inode and does an fsync() and wants the error to be returned.</p> </blockquote> <p>The data in the file also is corrupt. Having to unmount or delete the file to reset the fact that it can't safely be assumed to be on disk isn't insane.</p> <blockquote> <blockquote> <p>Both in postgres, <em>and</em> a lot of other applications, it's not at all guaranteed to consistently have one FD open for every file written. Therefore even the more recent per-fd errseq logic doesn't guarantee that the failure will ever be seen by an application diligently fsync()ing.</p> </blockquote> <p>... only if the application closes all fds for the file before calling fsync. 
If any fd is kept open from the time of the failure, it will return the original error on fsync() (and then no longer return it).</p> <p>It's not that you need to keep every fd open forever. You could put them into a shared pool, and re-use them if the file is &quot;re-opened&quot;, and call fsync on each fd before it is closed (because the pool is getting too big or because you want to flush the data for that file, or shut down the DB). That wouldn't require a huge re-architecture of PG, just a small library to handle the shared fd pool.</p> </blockquote> <p>Except that postgres uses multiple processes. And works on a lot of architectures. If we started to fsync all opened files on process exit our users would <em>lynch</em> us. We'd need a complicated scheme that sends file descriptors across sockets between processes, then deduplicate them on the receiving side, somehow figuring out which is the oldest file descriptor (handling clockdrift safely).</p> <p>Note that it'd be perfectly fine that we've &quot;thrown away&quot; the buffer contents if we'd get notified that the fsync failed. We could just do WAL replay, and restore the contents (just as we do after crashes and/or for replication).</p> <blockquote> <p>That might even improve performance, because opening and closing files is itself not free, especially if you are working with remote filesystems.</p> </blockquote> <p>There's already a per-process cache of open files.</p> <blockquote> <blockquote> <p>You'd not even need to have per inode information or such in the case that the block device goes away entirely. As the FS isn't generally unmounted in that case, you could trivially keep a per-mount (or superblock?) 
bit that says &quot;I died&quot; and set that instead of keeping per inode/whatever information.</p> </blockquote> <p>The filesystem will definitely return an error in this case, I don't think this needs any kind of changes:</p> <p>int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync) { if (unlikely(ext4_forced_shutdown(EXT4_SB(inode-&gt;i_sb)))) return -EIO;</p> </blockquote> <p>Well, I'm making that argument because several people argued that throwing away buffer contents in this case is the only way to not cause OOMs, and that that's incompatible with reporting errors. It's clearly not...</p> <blockquote> <blockquote> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work.</p> </blockquote> <p>I find it a bit of a disappointing response. I think it's fair to say that for advanced features, but we're talking about the basic guarantee that fsync actually does something even remotely reasonable.</p> </blockquote> <p>Linux (like PG) is run by people who develop it for their own needs, or are paid to develop it for the needs of others.</p> </blockquote> <p>Sure.</p> <blockquote> <p>Everyone already has too much work to do, so you need to find someone who has an interest in fixing this (IMHO very peculiar) use case. If PG developers want to add a tunable &quot;keep dirty pages in RAM on IO failure&quot;, I don't think that it would be too hard for someone to do. It might be harder to convince some of the kernel maintainers to accept it, and I've been on the losing side of that battle more than once. However, like everything you don't pay for, you can't require someone else to do this for you. 
It wouldn't hurt to see if Jeff Layton, who wrote the errseq patches, would be interested to work on something like this.</p> </blockquote> <p>I don't think this is that PG specific, as explained above.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Wed, 11 Apr 2018 19:32:21 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:</p> <blockquote> <p>To pound the broken record: there are many good reasons why Linux filesystem developers have said &quot;you should use direct IO&quot; to the PG devs each time we have this &quot;the kernel doesn't do [complex things PG needs]&quot; discussion.</p> </blockquote> <p>I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.</p> <p>And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).</p> <blockquote> <p>In this case, robust IO error reporting is easy with DIO. 
It's one of the reasons most of the high performance database engines are either using or moving to non-blocking AIO+DIO (RWF_NOWAIT) and use O_DSYNC/RWF_DSYNC for integrity-critical IO dispatch. This is also being driven by the availability of high performance, high IOPS solid state storage where buffering in RAM to optimise IO patterns and throughput provides no real performance benefit.</p> <p>Using the AIO+DIO infrastructure ensures errors are reported for the specific write that fails at failure time (i.e. in the aio completion event for the specific IO), yet high IO throughput can be maintained without the application needing its own threading infrastructure to prevent blocking.</p> <p>This means the application doesn't have to guess where the write error occurred to retry/recover, have to handle async write errors on close(), have to use fsync() to gather write IO errors and then infer where the IO failure was, or require kernels on every supported platform to jump through hoops to try to do exactly the right thing in error conditions for everyone in all circumstances at all times....</p> </blockquote> <p>Most of that sounds like a good thing to do, but you got to recognize that that's a lot of linux specific code.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Wed, 11 Apr 2018 19:51:13 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-11 19:32:21 -0700, Andres Freund wrote:</p> <blockquote> <p>And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database. 
What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).</p> </blockquote> <p>And before somebody argues that that's a too small window to trigger the problem realistically: Restoring large databases happens pretty commonly (for new replicas, testcases, or actual fatal issues), takes time, and it's where a lot of storage is actually written to for the first time in a while, so it's far from unlikely to trigger bad block errors or such.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Wed, 11 Apr 2018 20:02:48 -0700 </code></pre> <p>On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:</p> <blockquote> <blockquote> <blockquote> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.</p> </blockquote> <p>I think there are two issues here - &quot;fsync() on an fd that was just opened&quot; and &quot;persistent error state (without keeping dirty pages in memory)&quot;.</p> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> </blockquote> <p>And that's <em>horrible</em>. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.</p> </blockquote> <p>At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. 
We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error. That way err = close(open(&quot;file&quot;)); would report the most recent error.</p> <p>That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's &quot;good enough&quot; to only support it for recent files.</p> <p>Jeff, what do you think?</p> <hr /> <pre><code>From: &quot;Theodore Y. Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 12 Apr 2018 01:09:24 -0400 </code></pre> <p>On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:</p> <blockquote> <p>Most of that sounds like a good thing to do, but you got to recognize that that's a lot of linux specific code.</p> </blockquote> <p>I know it's not what PG has chosen, but realistically all of the other major databases and userspace based storage systems have used DIO precisely <em>because</em> it's the way to avoid OS-specific behavior or require OS-specific code. DIO is simple, and pretty much the same everywhere.</p> <p>In contrast, the exact details of how buffered I/O works can be quite different on different OS's. This is especially true if you take performance related details (e.g., the cleaning algorithm, how pages get chosen for eviction, etc.) into account.</p> <p>As I read the PG-hackers thread, I thought I saw acknowledgement that some of the behaviors you don't like with Linux also show up on other Unix or Unix-like systems?</p> <hr /> <pre><code>From: &quot;Theodore Y. 
Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 12 Apr 2018 01:34:45 -0400 </code></pre> <p>On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:</p> <blockquote> <blockquote> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> </blockquote> <p>And that's <em>horrible</em>. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.</p> </blockquote> <p>If there is no open file descriptor, and in many cases, no process (because it has already exited), it may be horrible, but what the h*ll else do you expect the OS to do?</p> <p>The solution we use at Google is that we watch for I/O errors using a completely different process that is responsible for monitoring machine health. It used to scrape dmesg, but we now arrange to have I/O errors get sent via a netlink channel to the machine health monitoring daemon. If it detects errors on a particular hard drive, it tells the cluster file system to stop using that disk, and to reconstruct from erasure code all of the data chunks on that disk onto other disks in the cluster. 
We then run a series of disk diagnostics to make sure we find all of the bad sectors (very often, where there is one bad sector, there are several more waiting to be found), and then afterwards, put the disk back into service.</p> <p>By making it be a separate health monitoring process, we can have HDD experts write much more sophisticated code that can ask the disk firmware for more information (e.g., SMART, the grown defect list), do much more careful scrubbing of the disk media, etc., before returning the disk back to service.</p> <blockquote> <blockquote> <p>Everyone already has too much work to do, so you need to find someone who has an interest in fixing this (IMHO very peculiar) use case. If PG developers want to add a tunable &quot;keep dirty pages in RAM on IO failure&quot;, I don't think that it would be too hard for someone to do. It might be harder to convince some of the kernel maintainers to accept it, and I've been on the losing side of that battle more than once. However, like everything you don't pay for, you can't require someone else to do this for you. It wouldn't hurt to see if Jeff Layton, who wrote the errseq patches, would be interested to work on something like this.</p> </blockquote> <p>I don't think this is that PG specific, as explained above.</p> </blockquote> <p>The reality is that recovering from disk errors is tricky business, and I very much doubt most userspace applications, including distro package managers, are going to want to engineer for trying to detect and recover from disk errors. If that were true, then Red Hat and/or SuSE have kernel engineers, and they would have implemented everything on your wish list. 
They haven't, and that should tell you something.</p> <p>The other reality is that once a disk starts developing errors, in reality you will probably need to take the disk off-line, scrub it to find any other media errors, and there's a good chance you'll need to rewrite bad sectors (including some which are on top of file system metadata, so you probably will have to run fsck or reformat the whole file system). I certainly don't think it's realistic to assume adding lots of sophistication to each and every userspace program.</p> <p>If you have tens or hundreds of thousands of disk drives, then you will need to do something automated, but I claim that you really don't want to smush all of that detailed exception handling and HDD repair technology into each database or cluster file system component. It really needs to be done in a separate health-monitor and machine-level management system.</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Thu, 12 Apr 2018 15:45:36 +1000 </code></pre> <p>On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:</p> <blockquote> <p>To pound the broken record: there are many good reasons why Linux filesystem developers have said &quot;you should use direct IO&quot; to the PG devs each time we have this &quot;the kernel doesn't do [complex things PG needs]&quot; discussion.</p> </blockquote> <p>I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) 
etc.</p> <p>And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea.</p> </blockquote> <p>Yes it is.</p> <p>This is what syncfs() is for - making sure a large amount of data and metadata spread across many files and subdirectories in a single filesystem is pushed to stable storage in the most efficient manner possible.</p> <blockquote> <p>Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).</p> </blockquote> <p>No, just saying fsyncing individual files and directories is about the most inefficient way you could possibly go about doing this.</p> <hr /> <pre><code>From: Lukas Czerner &lt;lczerner@...hat.com&gt; Date: Thu, 12 Apr 2018 12:19:26 +0200 </code></pre> <p>On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:</p> <blockquote> <p>And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).</p> </blockquote> <p>Does not seem like a problem to me, just checksum the thing if you really need to be extra safe. 
You should probably be doing it anyway if you backup / archive / timetravel / whatnot.</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Thu, 12 Apr 2018 07:09:14 -0400 </code></pre> <p>On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:</p> <blockquote> <blockquote> <blockquote> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.</p> </blockquote> <p>I think there are two issues here - &quot;fsync() on an fd that was just opened&quot; and &quot;persistent error state (without keeping dirty pages in memory)&quot;.</p> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> </blockquote> <p>And that's <em>horrible</em>. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.</p> </blockquote> </blockquote> <p>What are you expecting to happen in this case? Are you expecting a read error due to a writeback failure? Or are you just saying that we should be invalidating pages that failed to be written back, so that they can be re-read?</p> <blockquote> <p>At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error. 
That way err = close(open(&quot;file&quot;)); would report the most recent error.</p> <p>That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's &quot;good enough&quot; to only support it for recent files.</p> <p>Jeff, what do you think?</p> </blockquote> <p>I hate it :). We could do that, but....yecchhhh.</p> <p>Reporting errors only in the case where the inode happened to stick around in the cache seems too unreliable for real-world usage, and might be problematic for some use cases. I'm also not sure it would really be helpful.</p> <p>I think the crux of the matter here is not really about error reporting, per-se. I asked this at LSF last year, and got no real answer:</p> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> <p>Maybe that's ok in the face of a writeback error though? IDK.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Thu, 12 Apr 2018 04:19:48 -0700 </code></pre> <p>On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error. 
That way err = close(open(&quot;file&quot;)); would report the most recent error.</p> <p>That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's &quot;good enough&quot; to only support it for recent files.</p> <p>Jeff, what do you think?</p> </blockquote> <p>I hate it :). We could do that, but....yecchhhh.</p> <p>Reporting errors only in the case where the inode happened to stick around in the cache seems too unreliable for real-world usage, and might be problematic for some use cases. I'm also not sure it would really be helpful.</p> </blockquote> <p>Yeah, it's definitely half-arsed. We could make further changes to improve the situation, but they'd have wider impact. For example, we can tell if the error has been sampled by any existing fd, so we could bias our inode reaping to have inodes with unreported errors stick around in the cache for longer.</p> <blockquote> <p>I think the crux of the matter here is not really about error reporting, per-se. I asked this at LSF last year, and got no real answer:</p> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> </blockquote> <p>I suspect it isn't. If there's a transient error then we should reattempt the write. OTOH if the error is permanent then reattempting the write isn't going to do any good and it's just going to cause the drive to go through the whole error handling dance again. And what do we do if we're low on memory and need these pages back to avoid going OOM? There's a lot of options here, all of them bad in one situation or another.</p> <blockquote> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. 
It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> <p>Maybe that's ok in the face of a writeback error though? IDK.</p> </blockquote> <p>I don't know either. It'd force the application to face up to the fact that the data is gone immediately rather than only finding it out after a reboot. Again though that might cause more problems than it solves. It's hard to know what the right thing to do is.</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Thu, 12 Apr 2018 07:24:12 -0400 </code></pre> <p>On Thu, 2018-04-12 at 15:45 +1000, Dave Chinner wrote:</p> <blockquote> <p>On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:</p> <blockquote> <p>To pound the broken record: there are many good reasons why Linux filesystem developers have said &quot;you should use direct IO&quot; to the PG devs each time we have this &quot;the kernel doesn't do [complex things PG needs]&quot; discussion.</p> </blockquote> <p>I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.</p> <p>And there's cases where that just doesn't help at all. 
Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea.</p> </blockquote> <p>Yes it is.</p> <p>This is what syncfs() is for - making sure a large amount of data and metadata spread across many files and subdirectories in a single filesystem is pushed to stable storage in the most efficient manner possible.</p> </blockquote> <p>Just note that the error return from syncfs is somewhat iffy. It doesn't necessarily return an error when one inode fails to be written back. I think it mainly returns errors when you get a metadata writeback error.</p> <blockquote> <blockquote> <p>Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).</p> </blockquote> <p>No, just saying fsyncing individual files and directories is about the most inefficient way you could possibly go about doing this.</p> </blockquote> <p>You can still use syncfs but what you'd probably have to do is call syncfs while you still hold all of the fd's open, and then fsync each one afterward to ensure that they all got written back properly. That should work as you'd expect.</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Thu, 12 Apr 2018 22:01:22 +1000 </code></pre> <p>On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:</p> <blockquote> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> </blockquote> <p>There isn't a right thing. 
Whatever we do will be wrong for someone.</p> <blockquote> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> </blockquote> <p>Not to mention a POSIX IO ordering violation. Seeing stale data after a &quot;successful&quot; write is simply not allowed.</p> <blockquote> <p>Maybe that's ok in the face of a writeback error though? IDK.</p> </blockquote> <p>No matter what we do for async writeback error handling, it will be slightly different from filesystem to filesystem, not to mention OS to OS. There is no magic bullet here, so I'm not sure we should worry too much. There's direct IO for anyone who cares that need to know about the completion status of every single write IO....</p> <hr /> <pre><code>From: &quot;Theodore Y. Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 12 Apr 2018 11:16:46 -0400 </code></pre> <p>On Thu, Apr 12, 2018 at 10:01:22PM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:</p> <blockquote> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> </blockquote> <p>There isn't a right thing. Whatever we do will be wrong for someone.</p> </blockquote> <p>That's the problem. The best that could be done (and it's not enough) would be to have a mode which does what the PG folks want (or what they <em>think</em> they want). It seems what they want is to have an error result in the page being marked clean. 
When they discover the outcome (OOM-city and the inability to unmount a file system on a failed drive), then they will complain to us <em>again</em>, at which point we can tell them that what they really want is another variation on O_PONIES, and welcome to the real world and real life.</p> <p>Which is why, even if they were to pay someone to implement what they want, I'm not sure we would want to accept it upstream --- or distro's might consider it a support nightmare, and refuse to allow that mode to be enabled on enterprise distro's. But at least, it will have been some PG-based company who will have implemented it, so they're not wasting other people's time or other people's resources...</p> <p>We could try to get something like what Google is doing upstream, which is to have the I/O errors sent to userspace via a netlink channel (without changing anything else about how buffered writeback is handled in the face of errors). Then userspace applications could switch to Direct I/O like all of the other really serious userspace storage solutions I'm aware of, and then someone could try to write some kind of HDD health monitoring system that tries to do the right thing when a disk is discovered to have developed some media errors or something more serious (e.g., a head failure). That plus some kind of RAID solution is I think the only thing which is really realistic for a typical PG site.</p> <p>That's certainly what <em>I</em> would do if I didn't decide to use a hosted cloud solution, such as Cloud SQL for Postgres, and let someone else solve the really hard problems of dealing with real-world HDD failures. :-)</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Thu, 12 Apr 2018 11:08:50 -0400 </code></pre> <p>On Thu, 2018-04-12 at 22:01 +1000, Dave Chinner wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:</p> <blockquote> <p>When there is a writeback error, what should be done with the dirty page(s)? 
Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> </blockquote> <p>There isn't a right thing. Whatever we do will be wrong for someone.</p> <blockquote> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> </blockquote> <p>Not to mention a POSIX IO ordering violation. Seeing stale data after a &quot;successful&quot; write is simply not allowed.</p> </blockquote> <p>I'm not so sure here, given that we're dealing with an error condition. Are we really obligated not to allow any changes to pages that we can't write back?</p> <p>Given that the pages are clean after these failures, we aren't doing this even today:</p> <p>Suppose we're unable to do writes but can do reads vs. the backing store. After a wb failure, the page has the dirty bit cleared. If it gets kicked out of the cache before the read occurs, it'll have to be faulted back in. Poof -- your write just disappeared.</p> <p>That can even happen before you get the chance to call fsync, so even a write()+read()+fsync() is not guaranteed to be safe in this regard today, given sufficient memory pressure.</p> <p>I think the current situation is fine from a &quot;let's not OOM at all costs&quot; standpoint, but not so good for application predictability. We should really consider ways to do better here.</p> <blockquote> <blockquote> <p>Maybe that's ok in the face of a writeback error though? IDK.</p> </blockquote> <p>No matter what we do for async writeback error handling, it will be slightly different from filesystem to filesystem, not to mention OS to OS. There is no magic bullet here, so I'm not sure we should worry too much. 
There's direct IO for anyone who cares that need to know about the completion status of every single write IO....</p> </blockquote> <p>I think we have an opportunity here to come up with better defined and hopefully more useful behavior for buffered I/O in the face of writeback errors. The first step would be to hash out what we'd want it to look like.</p> <p>Maybe we need a plenary session at LSF/MM?</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 12:46:27 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-12 12:19:26 +0200, Lukas Czerner wrote:</p> <blockquote> <p>On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:</p> <blockquote> <p>And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).</p> </blockquote> <p>Does not seem like a problem to me, just checksum the thing if you really need to be extra safe. You should probably be doing it anyway if you backup / archive / timetravel / whatnot.</p> </blockquote> <p>That doesn't really help, unless you want to sync() and then re-read all the data to make sure it's the same. Rereading multi-TB backups just to know whether there was an error that the OS knew about isn't particularly fun. 
Without verifying after sync it's not going to improve the situation measurably, you're still only going to discover that $data isn't available when it's needed.</p> <p>What you're saying here is that there's no way to use standard linux tools to manipulate files and know whether it failed, without filtering kernel logs for IO errors. Or am I missing something?</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 12:55:36 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-12 01:34:45 -0400, Theodore Y. Ts'o wrote:</p> <blockquote> <p>The solution we use at Google is that we watch for I/O errors using a completely different process that is responsible for monitoring machine health. It used to scrape dmesg, but we now arrange to have I/O errors get sent via a netlink channel to the machine health monitoring daemon.</p> </blockquote> <p>Any pointers to the underlying netlink mechanism? If we can force postgres to kill itself when such an error is detected (via a dedicated monitoring process), I'd personally be happy enough. It'd be nicer if we could associate that knowledge with particular filesystems etc (which'd possibly be hard through dm etc?), but this'd be much better than nothing.</p> <blockquote> <p>The reality is that recovering from disk errors is tricky business, and I very much doubt most userspace applications, including distro package managers, are going to want to engineer for trying to detect and recover from disk errors. If that were true, then Red Hat and/or SuSE have kernel engineers, and they would have implemented everything on your wish list. They haven't, and that should tell you something.</p> </blockquote> <p>The problem really isn't about <em>recovering</em> from disk errors. <em>Knowing</em> about them is the crucial part. We do not want to give back clients the information that an operation succeeded, when it actually didn't. 
There could be improvements above that, but as long as it's guaranteed that &quot;we&quot; get the error (rather than just some kernel log we don't have access to, which looks different due to config etc), it's ok. We can throw our hands up in the air and give up.</p> <blockquote> <p>The other reality is that once a disk starts developing errors, in reality you will probably need to take the disk off-line, scrub it to find any other media errors, and there's a good chance you'll need to rewrite bad sectors (including some which are on top of file system metadata, so you probably will have to run fsck or reformat the whole file system). I certainly don't think it's realistic to assume adding lots of sophistication to each and every userspace program.</p> <p>If you have tens or hundreds of thousands of disk drives, then you will need to do something automated, but I claim that you really don't want to smush all of that detailed exception handling and HDD repair technology into each database or cluster file system component. It really needs to be done in a separate health-monitor and machine-level management system.</p> </blockquote> <p>Yea, agreed on all that. I don't think anybody actually involved in postgres wants to do anything like that. Seems far outside of postgres' remit.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 13:13:22 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-12 11:16:46 -0400, Theodore Y. Ts'o wrote:</p> <blockquote> <p>That's the problem. The best that could be done (and it's not enough) would be to have a mode which does what the PG folks want (or what they <em>think</em> they want). It seems what they want is to have an error result in the page being marked clean. 
When they discover the outcome (OOM-city and the inability to unmount a file system on a failed drive), then they will complain to us <em>again</em>, at which point we can tell them that what they really want is another variation on O_PONIES, and welcome to the real world and real life.</p> </blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient. I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem. If the drive is entirely gone there's obviously no point in keeping per-file information around, so per-blockdev/fs information suffices entirely to return an error on fsync (which at least on ext4 appears to happen if the underlying blockdev is gone).</p> <p>Have fun making up things we want, but I'm not sure it's particularly productive.</p> <blockquote> <p>Which is why, even if they were to pay someone to implement what they want, I'm not sure we would want to accept it upstream --- or distros might consider it a support nightmare, and refuse to allow that mode to be enabled on enterprise distros. But at least, it will have been some PG-based company who will have implemented it, so they're not wasting other people's time or other people's resources...</p> </blockquote> <p>Well, that's why I'm discussing here so we can figure out what's acceptable before considering wasting money and review cycles doing or paying somebody to do some crazy useless shit.</p> <blockquote> <p>We could try to get something like what Google is doing upstream, which is to have the I/O errors sent to userspace via a netlink channel (without changing anything else about how buffered writeback is handled in the face of errors).</p> </blockquote> <p>Ah, darn. After you'd mentioned that in an earlier mail I'd hoped that'd be upstream. 
And yes, that'd be perfect.</p> <blockquote> <p>Then userspace applications could switch to Direct I/O like all of the other really serious userspace storage solutions I'm aware of, and then someone could try to write some kind of HDD health monitoring system that tries to do the right thing when a disk is discovered to have developed some media errors or something more serious (e.g., a head failure). That plus some kind of RAID solution is I think the only thing which is really realistic for a typical PG site.</p> </blockquote> <p>As I said earlier, I think there's good reason to move to DIO for postgres. But to keep that performant is going to need some serious work.</p> <p>But afaict such a solution wouldn't really depend on applications using DIO or not. Before finishing a checkpoint (logging it persistently and allowing to throw older data away), we could check if any errors have been reported and give up if there have been any. And after starting postgres on a directory restored from backup using $tool, we can fsync the directory recursively, check for such errors, and give up if there've been any.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 13:24:57 -0700 </code></pre> <p>On 2018-04-12 07:09:14 -0400, Jeff Layton wrote:</p> <blockquote> <p>On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:</p> <blockquote> <blockquote> <blockquote> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. 
That's when we're majorly screwed.</p> </blockquote> <p>I think there are two issues here - &quot;fsync() on an fd that was just opened&quot; and &quot;persistent error state (without keeping dirty pages in memory)&quot;.</p> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> </blockquote> <p>And that's <em>horrible</em>. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.</p> </blockquote> </blockquote> <p>What are you expecting to happen in this case? Are you expecting a read error due to a writeback failure? Or are you just saying that we should be invalidating pages that failed to be written back, so that they can be re-read?</p> </blockquote> <p>Yes, I'd hope for a read error after a writeback failure. I think that's sane behaviour. But I don't really care <em>that</em> much.</p> <p>At the very least <em>some</em> way to <em>know</em> that such a failure occurred from userland without having to parse the kernel log. 
As far as I understand, neither sync(2) (and thus sync(1)) nor syncfs(2) is guaranteed to report an error if it was encountered by writeback in the background.</p> <p>If that's indeed true for syncfs(2), even if the fd has been opened before (which I can see how it could happen from an implementation POV, nothing would associate a random FD with failures on different files), it's really impossible to detect this stuff from userland without text parsing.</p> <p>Even if it were just a per-fs /sys/$something file that'd return the current count of unreported errors in a filesystem-independent way, it'd be better than what we have right now.</p> <pre><code>1) figure out /sys/$whatnot $directory belongs to
2) oldcount=$(cat /sys/$whatnot/unreported_errors)
3) filesystem operations in $directory
4) sync;sync;
5) newcount=$(cat /sys/$whatnot/unreported_errors)
6) test &quot;$oldcount&quot; -eq &quot;$newcount&quot; || die-with-horrible-message
</code></pre> <p>Isn't beautiful to script, but it's also not absolutely terrible.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Thu, 12 Apr 2018 13:28:30 -0700 </code></pre> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> <blockquote> <p>I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem.</p> </blockquote> <p>Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. 
If instead we transfer the error indicator to the superblock, then there's no problem.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 14:11:45 -0700 </code></pre> <p>On 2018-04-12 07:24:12 -0400, Jeff Layton wrote:</p> <blockquote> <p>On Thu, 2018-04-12 at 15:45 +1000, Dave Chinner wrote:</p> <blockquote> <p>On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:</p> <blockquote> <p>To pound the broken record: there are many good reasons why Linux filesystem developers have said &quot;you should use direct IO&quot; to the PG devs each time we have this &quot;the kernel doesn't do what PG needs&quot; discussion.</p> </blockquote> <p>I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.</p> <p>And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea.</p> </blockquote> <p>Yes it is.</p> <p>This is what syncfs() is for - making sure a large amount of data and metadata spread across many files and subdirectories in a single filesystem is pushed to stable storage in the most efficient manner possible.</p> </blockquote> </blockquote> <p>syncfs isn't standardized, it operates on an entire filesystem (thus writing out unnecessary stuff), it has no meaningful documentation of its return codes. 
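</p> <p>For reference, the &quot;fsync the directory tree&quot; approach mentioned in the quoted text can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything taken from postgres: the helper name <code>fsync_tree</code> is hypothetical, it assumes a Linux-like system where a directory can be opened read-only and passed to fsync(), and, as the thread itself points out, an fsync() on a freshly opened descriptor is not guaranteed to report errors that background writeback hit earlier.</p>

```python
import os


def fsync_tree(root):
    """fsync every regular file and directory below root (inclusive).

    Returns (synced, failures), where failures is a list of
    (path, errno) pairs for every open() or fsync() that failed.
    """
    synced = 0
    failures = []
    # Walk bottom-up so file contents are flushed before the directory
    # entries that point at them.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        files = [os.path.join(dirpath, name) for name in filenames]
        # Skip symlinks, FIFOs, sockets and other non-regular entries.
        targets = [p for p in files
                   if os.path.isfile(p) and not os.path.islink(p)]
        targets.append(dirpath)
        for path in targets:
            try:
                fd = os.open(path, os.O_RDONLY)
            except OSError as exc:
                failures.append((path, exc.errno))
                continue
            try:
                os.fsync(fd)
                synced += 1
            except OSError as exc:
                failures.append((path, exc.errno))
            finally:
                os.close(fd)
    return synced, failures
```

<p>A caller would treat any non-empty <code>failures</code> list as a reason to give up; an empty list only means that no fsync() call returned an error, which is weaker than proof that every byte reached stable storage.</p> <p>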
Yes, using syncfs() might be better performance-wise, but it doesn't seem like it actually solves anything, performance aside:</p> <blockquote> <p>Just note that the error return from syncfs is somewhat iffy. It doesn't necessarily return an error when one inode fails to be written back. I think it mainly returns errors when you get a metadata writeback error.</p> <p>You can still use syncfs but what you'd probably have to do is call syncfs while you still hold all of the fd's open, and then fsync each one afterward to ensure that they all got written back properly. That should work as you'd expect.</p> </blockquote> <p>Which again doesn't allow one to use any non-bespoke tooling (like tar or whatnot). And it means you'll have to call syncfs() every few hundred files, because you'll obviously run into filehandle limitations.</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Thu, 12 Apr 2018 17:14:54 -0400 </code></pre> <p>On Thu, 2018-04-12 at 13:28 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> </blockquote> <p>Not a bad idea and shouldn't be too costly. mapping_set_error could flag the superblock one before or after the one in the mapping.</p> <p>We'd need to define what happens if you interleave fsync and syncfs calls on the same inode though. How do we handle file-&gt;f_wb_err in that case? 
Would we need a second field in struct file to act as the per-sb error cursor?</p> <blockquote> <blockquote> <p>I don't see that that'd realistically would trigger OOM or the inability to unmount a filesystem.</p> </blockquote> <p>Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. If instead we transfer the error indicator to the superblock, then there's no problem.</p> </blockquote> <hr /> <pre><code>From: &quot;Theodore Y. Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 12 Apr 2018 17:21:44 -0400 </code></pre> <p>On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> </blockquote> <p>When or how would the per-superblock wb_err flag get cleared?</p> <p>Would all subsequent fsync() calls on that file system now return EIO? Or would only all subsequent syncfs() calls return EIO?</p> <blockquote> <blockquote> <p>I don't see that that'd realistically would trigger OOM or the inability to unmount a filesystem.</p> </blockquote> <p>Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. 
If instead we transfer the error indicator to the superblock, then there's no problem.</p> </blockquote> <p>Actually, I was referring to the pg-hackers original ask, which was that after an error, all of the dirty pages that couldn't be written out would stay dirty.</p> <p>If it's only as single inode which is pinned in memory with the dirty flag, that's bad, but it's not as bad as pinning all of the memory pages for which there was a failed write. We would still need to invent some mechanism or define some semantic when it would be OK to clear the per-inode flag and let the memory associated with that pinned inode get released, though.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Thu, 12 Apr 2018 14:24:32 -0700 </code></pre> <p>On Thu, Apr 12, 2018 at 05:21:44PM -0400, Theodore Y. Ts'o wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> </blockquote> <p>When or how would the per-superblock wb_err flag get cleared?</p> </blockquote> <p>That's not how errseq works, Ted ;-)</p> <blockquote> <p>Would all subsequent fsync() calls on that file system now return EIO? 
Or would only all subsequent syncfs() calls return EIO?</p> </blockquote> <p>Only ones which occur after the last sampling get reported through this particular file descriptor.</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Thu, 12 Apr 2018 17:27:54 -0400 </code></pre> <p>On Thu, 2018-04-12 at 13:24 -0700, Andres Freund wrote:</p> <blockquote> <p>On 2018-04-12 07:09:14 -0400, Jeff Layton wrote:</p> <blockquote> <p>On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:</p> <blockquote> <blockquote> <blockquote> <p>While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.</p> </blockquote> <p>I think there are two issues here - &quot;fsync() on an fd that was just opened&quot; and &quot;persistent error state (without keeping dirty pages in memory)&quot;.</p> <p>If there is background data writeback <em>without an open file descriptor</em>, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.</p> </blockquote> <p>And that's <em>horrible</em>. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.</p> </blockquote> </blockquote> <p>What are you expecting to happen in this case? Are you expecting a read error due to a writeback failure? Or are you just saying that we should be invalidating pages that failed to be written back, so that they can be re-read?</p> </blockquote> <p>Yes, I'd hope for a read error after a writeback failure. I think that's sane behaviour. 
But I don't really care <em>that</em> much.</p> </blockquote> <p>I'll have to respectfully disagree. Why should I interpret an error on a read() syscall to mean that writeback failed? Note that the data is still potentially intact.</p> <p>What <em>might</em> make sense, IMO, is to just invalidate the pages that failed to be written back. Then you could potentially do a read to fault them in again (i.e. sync the pagecache and the backing store) and possibly redirty them for another try.</p> <p>Note that you can detect this situation by checking the return code from fsync. It should report the latest error once per file description.</p> <blockquote> <p>At the very least <em>some</em> way to <em>know</em> that such a failure occurred from userland without having to parse the kernel log. As far as I understand, neither sync(2) (and thus sync(1)) nor syncfs(2) is guaranteed to report an error if it was encountered by writeback in the background.</p> <p>If that's indeed true for syncfs(2), even if the fd has been opened before (which I can see how it could happen from an implementation POV, nothing would associate a random FD with failures on different files), it's really impossible to detect this stuff from userland without text parsing.</p> </blockquote> <p>syncfs could use some work.</p> <p>I'm warming to willy's idea to add a per-sb errseq_t. I think that might be a simple way to get better semantics here. Not sure how we want to handle the reporting end yet though...</p> <p>We probably also need to consider how to better track metadata writeback errors (on e.g. ext2). 
We don't really do that properly at quite yet either.</p> <blockquote> <p>Even if it'd were just a perf-fs /sys/$something file that'd return the current count of unreported errors in a filesystem independent way, it'd be better than what we have right now.</p> <p>1) figure out /sys/$whatnot $directory belongs to 2) oldcount=$(cat /sys/$whatnot/unreported_errors) 3) filesystem operations in $directory 4) sync;sync; 5) newcount=$(cat /sys/$whatnot/unreported_errors) 6) test &quot;$oldcount&quot; -eq &quot;$newcount&quot; || die-with-horrible-message</p> <p>Isn't beautiful to script, but it's also not absolutely terrible.</p> </blockquote> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Thu, 12 Apr 2018 14:31:10 -0700 </code></pre> <p>On Thu, Apr 12, 2018 at 05:14:54PM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Thu, 2018-04-12 at 13:28 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per- superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> </blockquote> <p>Not a bad idea and shouldn't be too costly. mapping_set_error could flag the superblock one before or after the one in the mapping.</p> <p>We'd need to define what happens if you interleave fsync and syncfs calls on the same inode though. How do we handle file-&gt;f_wb_err in that case? Would we need a second field in struct file to act as the per-sb error cursor?</p> </blockquote> <p>Ooh. I hadn't thought that through. Bleh. I don't want to add a field to struct file for this uncommon case.</p> <p>Maybe O_PATH could be used for this? 
It gets you a file descriptor on a particular filesystem, so syncfs() is defined, but it can't report a writeback error. So if you open something O_PATH, you can use the file's f_wb_err for the mapping's error cursor.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 14:37:56 -0700 </code></pre> <p>On 2018-04-12 17:21:44 -0400, Theodore Y. Ts'o wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> </blockquote> <p>When or how would the per-superblock wb_err flag get cleared?</p> </blockquote> <p>I don't think unmount + resettable via /sys would be an insane approach. Requiring explicit action to acknowledge data loss isn't a crazy concept. But I think that's something reasonable minds could disagree with.</p> <blockquote> <p>Would all subsequent fsync() calls on that file system now return EIO? Or would only all subsequent syncfs() calls return EIO?</p> </blockquote> <p>If it were tied to syncfs, I wonder if there's a way to have some errseq type logic. Store a per superblock (or whatever equivalent thing) errseq value of errors. For each fd calling syncfs() report the error once, but then store the current value in a separate per-fd field. 
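</p> <p>The per-fd cursor idea in the preceding sentences can be modelled with a small toy in Python. This is a sketch only: the class names <code>ErrorSequence</code> and <code>Handle</code> and the <code>report_existing</code> flag are hypothetical, and the model is deliberately much simpler than the kernel's errseq_t. It shows one shared error record per filesystem, plus a cursor per open descriptor so that each descriptor reports a given error at most once.</p>

```python
import errno


class ErrorSequence:
    """Toy per-filesystem error record (not the kernel's errseq_t)."""

    def __init__(self):
        self.seq = 0   # incremented each time an error is recorded
        self.err = 0   # errno of the most recent error, 0 if none

    def record(self, err):
        self.seq += 1
        self.err = err


class Handle:
    """Hypothetical open descriptor holding its own cursor into the sequence."""

    def __init__(self, errors, report_existing=False):
        self.errors = errors
        # Either start from "now", or from zero so that errors recorded
        # before this handle was opened are still reported once.
        self.cursor = 0 if report_existing else errors.seq

    def syncfs(self):
        """Return the latest errno if unseen errors exist, else 0."""
        if self.cursor == self.errors.seq:
            return 0
        self.cursor = self.errors.seq
        return self.errors.err
```

<p>With <code>report_existing=False</code> a handle only sees errors recorded after it was opened; with <code>report_existing=True</code> it also gets one report for anything recorded earlier. Which of the two behaviours is preferable is exactly the design question under discussion.</p> <p>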
And if that's considered too weird, only report the errors to fds that have been opened from before the error occurred.</p> <p>I can see writing a tool 'pg_run_and_sync /directo /ries -- command' which opens an fd for each of the filesystems the directories reside on, and calls syncfs() after. That'd allow to use backup/restore tools at least semi safely.</p> <blockquote> <blockquote> <blockquote> <p>I don't see that that'd realistically would trigger OOM or the inability to unmount a filesystem.</p> </blockquote> <p>Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. If instead we transfer the error indicator to the superblock, then there's no problem.</p> </blockquote> <p>Actually, I was referring to the pg-hackers original ask, which was that after an error, all of the dirty pages that couldn't be written out would stay dirty.</p> </blockquote> <p>Well, it's an open list, everyone can argue. And initially people at first didn't know the OOM explanation, and then it takes some time to revise ones priors :). I think it's a design question that reasonable people can disagree upon (if &quot;hot&quot; removed devices are handled by throwing data away regardless, at least). But as it's clearly not something viable, we can move on to something that can solve the problem.</p> <blockquote> <p>If it's only as single inode which is pinned in memory with the dirty flag, that's bad, but it's not as bad as pinning all of the memory pages for which there was a failed write. We would still need to invent some mechanism or define some semantic when it would be OK to clear the per-inode flag and let the memory associated with that pinned inode get released, though.</p> </blockquote> <p>Yea, I agree that that's not obvious. One way would be to say that it's only automatically cleared when you unlink the file. 
A bit heavyhanded, but not too crazy.</p> <hr /> <pre><code>From: &quot;Theodore Y. Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 12 Apr 2018 17:52:52 -0400 </code></pre> <p>On Thu, Apr 12, 2018 at 12:55:36PM -0700, Andres Freund wrote:</p> <blockquote> <p>Any pointers to the underlying netlink mechanism? If we can force postgres to kill itself when such an error is detected (via a dedicated monitoring process), I'd personally be happy enough. It'd be nicer if we could associate that knowledge with particular filesystems etc (which'd possibly be hard through dm etc?), but this'd be much better than nothing.</p> </blockquote> <p>Yeah, sorry, it never got upstreamed. It's not really all that complicated, it was just that there were some other folks who wanted to do something similar, and there was a round of bike-shedding several years ago, and nothing ever went upstream. Part of the problem was that our original scheme sent up information about file system-level corruption reports --- e.g., those stemming from calls to ext4_error() --- and lots of people had different ideas about how to get all of the possible information up in some structured format. (Think something like uerf from Digital's OSF/1.)</p> <p>We did something <em>really</em> simple/stupid. We just sent essentially an ascii text string out the netlink socket. That's because what we were doing before was essentially scraping the output of dmesg (e.g. /dev/kmsg).</p> <p>That's actually probably the simplest thing to do, and it has the advantage that it will work even on ancient enterprise kernels that PG users are likely to want to use. So you will need to implement the dmesg text scraper anyway, and that's probably good enough for most use cases.</p> <blockquote> <p>The problem really isn't about <em>recovering</em> from disk errors. <em>Knowing</em> about them is the crucial part. We do not want to give back clients the information that an operation succeeded, when it actually didn't. 
There could be improvements above that, but as long as it's guaranteed that &quot;we&quot; get the error (rather than just some kernel log we don't have access to, which looks different due to config etc), it's ok. We can throw our hands up in the air and give up.</p> </blockquote> <p>Right, it's a little challenging because the actual regexp's you would need to use do vary from device driver to device driver. Fortunately nearly everything is a SCSI/SATA device these days, so there isn't <em>that</em> much variability.</p> <blockquote> <p>Yea, agreed on all that. I don't think anybody actually involved in postgres wants to do anything like that. Seems far outside of postgres' remit.</p> </blockquote> <p>Some people on the pg-hackers list were talking about wanting to retry the fsync() and hoping that would cause the write to somehow suceed. It's <em>possible</em> that might help, but it's not likely to be helpful in my experience.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 14:53:19 -0700 </code></pre> <p>On 2018-04-12 17:27:54 -0400, Jeff Layton wrote:</p> <blockquote> <p>On Thu, 2018-04-12 at 13:24 -0700, Andres Freund wrote:</p> <blockquote> <p>At the very least <em>some</em> way to <em>know</em> that such a failure occurred from userland without having to parse the kernel log. 
As far as I understand, neither sync(2) (and thus sync(1)) nor syncfs(2) is guaranteed to report an error if it was encountered by writeback in the background.</p> <p>If that's indeed true for syncfs(2), even if the fd has been opened before (which I can see how it could happen from an implementation POV, nothing would associate a random FD with failures on different files), it's really impossible to detect this stuff from userland without text parsing.</p> </blockquote> <p>syncfs could use some work.</p> </blockquote> <p>It's really too bad that it doesn't have a flags argument.</p> <blockquote> <p>We probably also need to consider how to better track metadata writeback errors (on e.g. ext2). We don't really do that properly at quite yet either.</p> <blockquote> <p>Even if it'd were just a perf-fs /sys/$something file that'd return the current count of unreported errors in a filesystem independent way, it'd be better than what we have right now.</p> <p>1) figure out /sys/$whatnot $directory belongs to 2) oldcount=$(cat /sys/$whatnot/unreported_errors) 3) filesystem operations in $directory 4) sync;sync; 5) newcount=$(cat /sys/$whatnot/unreported_errors) 6) test &quot;$oldcount&quot; -eq &quot;$newcount&quot; || die-with-horrible-message</p> <p>Isn't beautiful to script, but it's also not absolutely terrible.</p> </blockquote> </blockquote> <p>ext4 seems to have something roughly like that (/sys/fs/ext4/$dev/errors_count), and by my reading it already seems to be incremented from the necessary places. By my reading XFS doesn't seem to have something similar.</p> <p>Wouldn't be bad to standardize...</p> <hr /> <pre><code>From: &quot;Theodore Y. 
Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 12 Apr 2018 17:57:56 -0400 </code></pre> <p>On Thu, Apr 12, 2018 at 02:53:19PM -0700, Andres Freund wrote:</p> <blockquote> <blockquote> <blockquote> <p>Isn't beautiful to script, but it's also not absolutely terrible.</p> </blockquote> </blockquote> <p>ext4 seems to have something roughly like that (/sys/fs/ext4/$dev/errors_count), and by my reading it already seems to be incremented from the necessary places.</p> </blockquote> <p>This is only for file system inconsistencies noticed by the kernel. We don't bump that count for data block I/O errors.</p> <p>The same idea could be used on a block device level. It would be pretty simple to maintain a counter for I/O errors, and when the last error was detected on a particular device. You could even break out and track read errors and write errors separately if that would be useful.</p> <p>If you don't care what block was bad, but just that <em>some</em> I/O error had happened, a counter is definitely the simplest approach, and less hair to implement and use than something like a netlink channel or scraping dmesg....</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Thu, 12 Apr 2018 15:03:59 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-12 17:52:52 -0400, Theodore Y. Ts'o wrote:</p> <blockquote> <p>We did something <em>really</em> simple/stupid. We just sent essentially an ascii text string out the netlink socket. That's because what we were doing before was essentially scraping the output of dmesg (e.g. /dev/kmsg).</p> <p>That's actually probably the simplest thing to do, and it has the advantage that it will work even on ancient enterprise kernels that PG users are likely to want to use. So you will need to implement the dmesg text scraper anyway, and that's probably good enough for most use cases.</p> </blockquote> <p>The worst part of that is, as you mention below, needing to handle a lot of different error message formats. 
I guess it's reasonable enough if you control your hardware, but no such luck.</p> <p>Aren't there quite realistic scenarios where one could miss kmsg style messages due to it being a ringbuffer?</p> <blockquote> <p>Right, it's a little challenging because the actual regexp's you would need to use do vary from device driver to device driver. Fortunately nearly everything is a SCSI/SATA device these days, so there isn't <em>that</em> much variability.</p> </blockquote> <p>There's also SAN / NAS type stuff - not all of that presents as a SCSI/SATA device, right?</p> <blockquote> <blockquote> <p>Yea, agreed on all that. I don't think anybody actually involved in postgres wants to do anything like that. Seems far outside of postgres' remit.</p> </blockquote> <p>Some people on the pg-hackers list were talking about wanting to retry the fsync() and hoping that would cause the write to somehow suceed. It's <em>possible</em> that might help, but it's not likely to be helpful in my experience.</p> </blockquote> <p>Depends on the type of error and storage. ENOSPC, especially over NFS, has some reasonable chances of being cleared up. And for networked block storage it's also not impossible to think of scenarios where that'd work for EIO.</p> <p>But I think besides hope of clearing up itself, it has the advantage that it trivially can give <em>some</em> feedback to the user. The user'll get back strerror(ENOSPC) with some decent SQL error code, which'll hopefully cause them to investigate (well, once monitoring detects high error rates). 
It's much nicer for the user to type COMMIT; get an appropriate error back etc, than if the database just commits suicide.</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Fri, 13 Apr 2018 08:44:04 +1000 </code></pre> <p>On Thu, Apr 12, 2018 at 11:08:50AM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Thu, 2018-04-12 at 22:01 +1000, Dave Chinner wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:</p> <blockquote> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> </blockquote> <p>There isn't a right thing. Whatever we do will be wrong for someone.</p> <blockquote> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> </blockquote> <p>Not to mention a POSIX IO ordering violation. Seeing stale data after a &quot;successful&quot; write is simply not allowed.</p> </blockquote> <p>I'm not so sure here, given that we're dealing with an error condition. Are we really obligated not to allow any changes to pages that we can't write back?</p> </blockquote> <p>Posix says this about write():</p> <pre><code> After a write() to a regular file has successfully returned: Any successful read() from each byte position in the file that was modified by that write shall return the data specified by the write() for that position until such byte positions are again modified. 
</code></pre> <p>IOWs, even if there is a later error, we told the user the write was successful, and so according to POSIX we are not allowed to wind back the data to what it was before the write() occurred.</p> <blockquote> <p>Given that the pages are clean after these failures, we aren't doing this even today:</p> <p>Suppose we're unable to do writes but can do reads vs. the backing store. After a wb failure, the page has the dirty bit cleared. If it gets kicked out of the cache before the read occurs, it'll have to be faulted back in. Poof -- your write just disappeared.</p> </blockquote> <p>Yes - I was pointing out what the specification we supposedly conform to says about this behaviour, not that our current behaviour conforms to the spec. Indeed, have you even noticed xfs_aops_discard_page() and it's surrounding context on page writeback submission errors?</p> <p>To save you looking, XFS will trash the page contents completely on a filesystem level -&gt;writepage error. It doesn't mark them &quot;clean&quot;, doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written &quot;sucessfully&quot; to the cached page is now gone. It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.</p> <p>This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to chose between. Hence if we chose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to done with the knowledge it is a specification violation. 
Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Fri, 13 Apr 2018 08:56:38 -0400 </code></pre> <p>On Thu, 2018-04-12 at 14:31 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 05:14:54PM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Thu, 2018-04-12 at 13:28 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:</p> <blockquote> <p>I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.</p> </blockquote> <p>Ah; this was my suggestion to Jeff on IRC. That we add a per- superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.</p> </blockquote> <p>Not a bad idea and shouldn't be too costly. mapping_set_error could flag the superblock one before or after the one in the mapping.</p> <p>We'd need to define what happens if you interleave fsync and syncfs calls on the same inode though. How do we handle file-&gt;f_wb_err in that case? Would we need a second field in struct file to act as the per-sb error cursor?</p> </blockquote> <p>Ooh. I hadn't thought that through. Bleh. I don't want to add a field to struct file for this uncommon case.</p> <p>Maybe O_PATH could be used for this? It gets you a file descriptor on a particular filesystem, so syncfs() is defined, but it can't report a writeback error. So if you open something O_PATH, you can use the file's f_wb_err for the mapping's error cursor.</p> </blockquote> <p>That might work.</p> <p>It'd be a syscall behavioral change so we'd need to document that well. 
It's probably innocuous though -- I doubt we have a lot of callers in the field opening files with O_PATH and calling syncfs on them.</p> <hr /> <pre><code>From: Jeff Layton &lt;jlayton@...hat.com&gt; Date: Fri, 13 Apr 2018 09:18:56 -0400 </code></pre> <p>On Fri, 2018-04-13 at 08:44 +1000, Dave Chinner wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 11:08:50AM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Thu, 2018-04-12 at 22:01 +1000, Dave Chinner wrote:</p> <blockquote> <p>On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:</p> <blockquote> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> </blockquote> <p>There isn't a right thing. Whatever we do will be wrong for someone.</p> <blockquote> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> </blockquote> <p>Not to mention a POSIX IO ordering violation. Seeing stale data after a &quot;successful&quot; write is simply not allowed.</p> </blockquote> <p>I'm not so sure here, given that we're dealing with an error condition. Are we really obligated not to allow any changes to pages that we can't write back?</p> </blockquote> <p>Posix says this about write():</p> <p>After a write() to a regular file has successfully returned:</p> <pre><code> Any successful read() from each byte position in the file that was modified by that write shall return the data specified by the write() for that position until such byte positions are again modified. 
</code></pre> <p>IOWs, even if there is a later error, we told the user the write was successful, and so according to POSIX we are not allowed to wind back the data to what it was before the write() occurred.</p> <blockquote> <p>Given that the pages are clean after these failures, we aren't doing this even today:</p> <p>Suppose we're unable to do writes but can do reads vs. the backing store. After a wb failure, the page has the dirty bit cleared. If it gets kicked out of the cache before the read occurs, it'll have to be faulted back in. Poof -- your write just disappeared.</p> </blockquote> <p>Yes - I was pointing out what the specification we supposedly conform to says about this behaviour, not that our current behaviour conforms to the spec. Indeed, have you even noticed xfs_aops_discard_page() and it's surrounding context on page writeback submission errors?</p> <p>To save you looking, XFS will trash the page contents completely on a filesystem level -&gt;writepage error. It doesn't mark them &quot;clean&quot;, doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written &quot;sucessfully&quot; to the cached page is now gone. It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.</p> <p>This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to chose between. Hence if we chose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to done with the knowledge it is a specification violation. Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....</p> </blockquote> <p>Got it, thanks.</p> <p>Yes, I think we ought to probably do the same thing globally. 
It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.</p> <p>So to summarize, at this point in the discussion, I think we want to consider doing the following:</p> <ul> <li>better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.</li> <li>invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.</li> </ul> <p>Did I miss anything? Would that be enough to help the Pg usecase?</p> <p>I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Fri, 13 Apr 2018 06:25:35 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-13 09:18:56 -0400, Jeff Layton wrote:</p> <blockquote> <p>Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.</p> <p>So to summarize, at this point in the discussion, I think we want to consider doing the following:</p> <ul> <li>better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.</li> <li>invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. 
Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.</li> </ul> <p>Did I miss anything? Would that be enough to help the Pg usecase?</p> <p>I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.</p> </blockquote> <p>It's not perfect, but I think the amount of hacky OS specific code should be acceptable. And it does allow for a wrapper tool that can be used around backup restores etc to syncfs all the necessary filesystems. Let me mull with others for a bit.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Fri, 13 Apr 2018 07:02:32 -0700 </code></pre> <p>On Fri, Apr 13, 2018 at 09:18:56AM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Fri, 2018-04-13 at 08:44 +1000, Dave Chinner wrote:</p> <blockquote> <p>To save you looking, XFS will trash the page contents completely on a filesystem level -&gt;writepage error. It doesn't mark them &quot;clean&quot;, doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written &quot;sucessfully&quot; to the cached page is now gone. It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.</p> <p>This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to chose between. Hence if we chose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to done with the knowledge it is a specification violation. 
Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....</p> </blockquote> <p>Got it, thanks.</p> <p>Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.</p> <p>So to summarize, at this point in the discussion, I think we want to consider doing the following:</p> <ul> <li>better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.</li> <li>invalidate or clear uptodate flag on pages that experience writebackerrors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.</li> </ul> <p>Did I miss anything? Would that be enough to help the Pg usecase?</p> <p>I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.</p> </blockquote> <p>I think we can do better than XFS is currently doing (but I agree that we should have the same behaviour across all Linux filesystems!)</p> <ol> <li>If we get an error while wbc-&gt;for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.</li> <li>Background writebacks should skip pages which are PageError.</li> <li>for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.</li> </ol> <p>I think kupdate writes are the same as for_background writes. for_reclaim is tougher. 
I don't want to see us getting into OOM because we're hanging onto stale data, but we don't necessarily have an open fd to report the error on. I think I'm leaning towards behaving the same for for_reclaim as for_sync, but this is probably a subject on which reasonable people can disagree.</p> <p>And this logic all needs to be in one place, although invoked from each filesystem.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Fri, 13 Apr 2018 07:48:07 -0700 </code></pre> <p>On Tue, Apr 10, 2018 at 03:07:26PM -0700, Andres Freund wrote:</p> <blockquote> <p>I don't think that's the full issue. We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, <em>THAT'S</em> the big issue.</p> <p>Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory fsync everything, and <em>then</em> assume you're safe. But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.</p> </blockquote> <p>While accepting that under memory pressure we can still evict the error indicators, we can do a better job than we do today. The current design of error reporting says that all errors which occurred before you opened the file descriptor are of no interest to you. I don't think that's necessarily true, and it's actually a change of behaviour from before the errseq work.</p> <p>Consider Stupid Task A which calls open(), write(), close(), and Smart Task B which calls open(), write(), fsync(), close() operating on the same file. If A goes entirely before B and encounters an error, before errseq_t, B would see the error from A's write.</p> <p>If A and B overlap, even a little bit, then B still gets to see A's error today. 
But if writeback happens for A's write before B opens the file then B will never see the error.</p> <p>B doesn't want to see historical errors that a previous invocation of B has already handled, but we know whether <em>anyone</em> has seen the error or not. So here's a patch which restores the historical behaviour of seeing old unhandled errors on a fresh file descriptor:</p> <p>Signed-off-by: Matthew Wilcox <a href="mailto:mawilcox@...rosoft.com">mawilcox@...rosoft.com</a></p> <pre><code>diff --git a/lib/errseq.c b/lib/errseq.c
index df782418b333..093f1fba4ee0 100644
--- a/lib/errseq.c
+++ b/lib/errseq.c
@@ -119,19 +119,11 @@ EXPORT_SYMBOL(errseq_set);
 errseq_t errseq_sample(errseq_t *eseq)
 {
     errseq_t old = READ_ONCE(*eseq);
-    errseq_t new = old;
-    /*
-     * For the common case of no errors ever having been set, we can skip
-     * marking the SEEN bit. Once an error has been set, the value will
-     * never go back to zero.
-     */
-    if (old != 0) {
-        new |= ERRSEQ_SEEN;
-        if (old != new)
-            cmpxchg(eseq, old, new);
-    }
-    return new;
+    /* If nobody has seen this error yet, then we can be the first. */
+    if (!(old &amp; ERRSEQ_SEEN))
+        old = 0;
+    return old;
 }
 EXPORT_SYMBOL(errseq_sample);
</code></pre> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Sat, 14 Apr 2018 11:47:52 +1000 </code></pre> <p>On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 09:18:56AM -0400, Jeff Layton wrote:</p> <blockquote> <p>On Fri, 2018-04-13 at 08:44 +1000, Dave Chinner wrote:</p> <blockquote> <p>To save you looking, XFS will trash the page contents completely on a filesystem level -&gt;writepage error. It doesn't mark them &quot;clean&quot;, doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written &quot;successfully&quot; to the cached page is now gone. 
It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.</p> <p>This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to chose between. Hence if we chose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to done with the knowledge it is a specification violation. Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....</p> </blockquote> <p>Got it, thanks.</p> <p>Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.</p> <p>So to summarize, at this point in the discussion, I think we want to consider doing the following:</p> <ul> <li><p>better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.</p></li> <li><p>invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.</p></li> </ul> <p>Did I miss anything? Would that be enough to help the Pg usecase?</p> <p>I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. 
That's a really thorny problem from an object lifetime perspective.</p> </blockquote> <p>I think we can do better than XFS is currently doing (but I agree that we should have the same behaviour across all Linux filesystems!)</p> <ol> <li>If we get an error while wbc-&gt;for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.</li> </ol> </blockquote> <p>So you're saying we should treat it as a transient error rather than a permanent error.</p> <blockquote> <ol> <li>Background writebacks should skip pages which are PageError.</li> </ol> </blockquote> <p>That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?</p> <p>e.g. XFS gets to enospc, runs out of reserve pool blocks so can't allocate space to write back the page, then space is freed up a few seconds later and so the next write will work just fine.</p> <p>This is a recipe for &quot;I lost data that I wrote /days/ before the system crashed&quot; bug reports.</p> <blockquote> <ol> <li>for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.</li> </ol> </blockquote> <p>Which may well be unmount. Are we really going to wait until unmount to report fatal errors?</p> <p>We used to do this with XFS metadata. We'd just keep trying to write metadata and keep the filesystem running (because it's consistent in memory and it might be a transient error) rather than shutting down the filesystem after a couple of retries. 
The result was that users wouldn't notice there were problems until unmount, and the most common symptom of that was &quot;why is system shutdown hanging?&quot;.</p> <p>We now don't hang at unmount by default:</p> <pre><code>$ cat /sys/fs/xfs/dm-0/error/fail_at_unmount
1
$
</code></pre> <p>And we treat different errors according to their seriousness. EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). Metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...</p> <p>We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. This is a clear example of &quot;one size doesn't fit all&quot; and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. /sys/fs/xfs//error/writeback//....)</p> <blockquote> <p>And this logic all needs to be in one place, although invoked from each filesystem.</p> </blockquote> <p>Perhaps so, but as there's no &quot;one-size-fits-all&quot; behaviour, I really want to extend the XFS error config infrastructure to control what the filesystem does on error here.</p> <hr /> <pre><code>From: Andres Freund &lt;andres@...razel.de&gt; Date: Fri, 13 Apr 2018 19:04:33 -0700 </code></pre> <p>Hi,</p> <p>On 2018-04-14 11:47:52 +1000, Dave Chinner wrote:</p> <blockquote> <p>And we treat different errors according to their seriousness. EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). Metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...</p> <p>We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. 
this is a clear example of &quot;one size doesn't fit all&quot; and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. /sys/fs/xfs//error/writeback//....)</p> </blockquote> <p>Have you considered adding an ext/fat/jfs errors=remount-ro/panic/continue style mount parameter?</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Fri, 13 Apr 2018 19:38:14 -0700 </code></pre> <p>On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:</p> <blockquote> <ol> <li>If we get an error while wbc-&gt;for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.</li> </ol> </blockquote> <p>So you're saying we should treat it as a transient error rather than a permanent error.</p> </blockquote> <p>Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.</p> <blockquote> <blockquote> <ol> <li>Background writebacks should skip pages which are PageError.</li> </ol> </blockquote> <p>That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?</p> </blockquote> <p>That's fair. What I want to avoid is triggering the same error every 30 seconds (or whatever the periodic writeback threshold is set to).</p> <blockquote> <p>e.g. XFS gets to enospc, runs out of reserve pool blocks so can't allocate space to write back the page, then space is freed up a few seconds later and so the next write will work just fine.</p> <p>This is a recipe for &quot;I lost data that I wrote /days/ before the system crashed&quot; bug reports.</p> </blockquote> <p>So ... 
exponential backoff on retries?</p> <blockquote> <blockquote> <ol> <li>for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.</li> </ol> </blockquote> <p>Which may well be unmount. Are we really going to wait until unmount to report fatal errors?</p> </blockquote> <p>Goodness, no. The errors would be immediately reportable using the wb_err mechanism, as soon as the first error was encountered.</p> <hr /> <hr /> <pre><code>From: bfields@...ldses.org (J. Bruce Fields) Date: Wed, 18 Apr 2018 12:52:19 -0400 </code></pre> <blockquote> <p>Theodore Y. Ts'o - 10.04.18, 20:43:</p> <blockquote> <p>First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.</p> </blockquote> </blockquote> <p>Pointers to documentation or papers or anything? The only google results I can find for &quot;power fail certified&quot; are your posts.</p> <p>I've always been confused by SSD power-loss protection, as nobody seems completely clear whether it's a safety or a performance feature.</p> <hr /> <pre><code>From: bfields@...ldses.org (J. 
Bruce Fields) Date: Wed, 18 Apr 2018 14:09:03 -0400 </code></pre> <p>On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-11 15:52:44 -0600, Andreas Dilger wrote:</p> <blockquote> <p>On Apr 10, 2018, at 4:07 PM, Andres Freund <a href="mailto:andres@...razel.de">andres@...razel.de</a> wrote:</p> <blockquote> <p>2018-04-10 18:43:56 Ted wrote:</p> <blockquote> <p>So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.</p> </blockquote> <p>That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.</p> </blockquote> <p>Sure, but it is mostly PG that is doing (IMHO) crazy things like writing to thousands(?) of files, closing the file descriptors, then expecting fsync() on a newly-opened fd to return a historical error.</p> </blockquote> <p>It's not just postgres. dpkg (underlying apt, on debian derived distros) to take an example I just randomly guessed, does too: /* We want to guarantee the extracted files are on the disk, so that the * subsequent renames to the info database do not end up with old or zero * length files in case of a system crash. As neither dpkg-deb nor tar do * explicit fsync()s, we have to do them here. * XXX: This could be avoided by switching to an internal tar extractor. */ dir_sync_contents(cidir);</p> <p>(a bunch of other places too)</p> <p>Especially on ext3 but also on newer filesystems it's performancewise entirely infeasible to fsync() every single file individually - the performance becomes entirely atrocious if you do that.</p> </blockquote> <p>Is that still true if you're able to use some kind of parallelism? 
(async io, or fsync from multiple processes?)</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Thu, 19 Apr 2018 09:59:50 +1000 </code></pre> <p>On Fri, Apr 13, 2018 at 07:04:33PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-14 11:47:52 +1000, Dave Chinner wrote:</p> <blockquote> <p>And we treat different errors according to their seriousness. EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...</p> <p>We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. this is a clear example of &quot;one size doesn't fit all&quot; and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. /sys/fs/xfs//error/writeback//....)</p> </blockquote> <p>Have you considered adding an ext/fat/jfs errors=remount-ro/panic/continue style mount parameter?</p> </blockquote> <p>That's for metadata writeback error behaviour, not data writeback IO errors.</p> <p>We are definitely not planning to add mount options to configure IO error behaviors. Mount options are a horrible way to configure filesystem behaviour and we've already got other, fine-grained configuration infrastructure for configuring IO error behaviour. Which, as I just pointed out, was designed to be extended to data writeback and other operational error handling in the filesystem (e.g. 
dealing with ENOMEM in different ways).</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Thu, 19 Apr 2018 10:13:43 +1000 </code></pre> <p>On Fri, Apr 13, 2018 at 07:38:14PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:</p> <blockquote> <ol> <li>If we get an error while wbc-&gt;for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.</li> </ol> </blockquote> <p>So you're saying we should treat it as a transient error rather than a permanent error.</p> </blockquote> <p>Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.</p> </blockquote> <p>And if it's getting IO errors because of USB stick pull? What then?</p> <blockquote> <blockquote> <blockquote> <ol> <li>Background writebacks should skip pages which are PageError.</li> </ol> </blockquote> <p>That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?</p> </blockquote> <p>That's fair. What I want to avoid is triggering the same error every 30 seconds (or whatever the periodic writeback threshold is set to).</p> </blockquote> <p>So if kernel ring buffer overflows and so users miss the first error report, they'll have no idea that the data writeback is still failing?</p> <blockquote> <blockquote> <p>e.g. XFS gets to enospc, runs out of reserve pool blocks so can't allocate space to write back the page, then space is freed up a few seconds later and so the next write will work just fine.</p> <p>This is a recipe for &quot;I lost data that I wrote /days/ before the system crashed&quot; bug reports.</p> </blockquote> <p>So ... 
exponential backoff on retries?</p> </blockquote> <p>Maybe, but I don't think that actually helps anything and adds yet more &quot;when should we write this&quot; complication to inode writeback....</p> <blockquote> <blockquote> <blockquote> <ol> <li>for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.</li> </ol> </blockquote> <p>Which may well be unmount. Are we really going to wait until unmount to report fatal errors?</p> </blockquote> <p>Goodness, no. The errors would be immediately reportable using the wb_err mechanism, as soon as the first error was encountered.</p> </blockquote> <p>But if there are no open files when the error occurs, that error won't get reported to anyone. Which means the next time anyone accesses that inode from a user context could very well be unmount or a third party sync/syncfs()....</p> <hr /> <pre><code>From: Eric Sandeen &lt;esandeen@...hat.com&gt; Date: Wed, 18 Apr 2018 19:23:46 -0500 </code></pre> <p>On 4/18/18 6:59 PM, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:04:33PM -0700, Andres Freund wrote:</p> <blockquote> <p>Hi,</p> <p>On 2018-04-14 11:47:52 +1000, Dave Chinner wrote:</p> <blockquote> <p>And we treat different errors according to their seriousness. EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...</p> <p>We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. this is a clear example of &quot;one size doesn't fit all&quot; and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. 
/sys/fs/xfs//error/writeback//....)</p> </blockquote> <p>Have you considered adding an ext/fat/jfs errors=remount-ro/panic/continue style mount parameter?</p> </blockquote> <p>That's for metadata writeback error behaviour, not data writeback IO errors.</p> </blockquote> <p>/me points casually at data_err=abort &amp; data_err=ignore in ext4...</p> <pre><code>data_err=ignore  Just print an error message if an error occurs in a file data buffer in ordered mode.
data_err=abort   Abort the journal if an error occurs in a file data buffer in ordered mode.
</code></pre> <p>Just sayin'</p> <blockquote> <p>We are definitely not planning to add mount options to configure IO error behaviors. Mount options are a horrible way to configure filesystem behaviour and we've already got other, fine-grained configuration infrastructure for configuring IO error behaviour. Which, as I just pointed out, was designed to be extended to data writeback and other operational error handling in the filesystem (e.g. 
dealing with ENOMEM in different ways).</p> </blockquote> <p>I don't disagree, but there are already mount-option knobs in ext4, FWIW.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Wed, 18 Apr 2018 17:40:37 -0700 </code></pre> <p>On Thu, Apr 19, 2018 at 10:13:43AM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:38:14PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:</p> <blockquote> <ol> <li>If we get an error while wbc-&gt;for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.</li> </ol> </blockquote> <p>So you're saying we should treat it as a transient error rather than a permanent error.</p> </blockquote> <p>Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.</p> </blockquote> <p>And if it's getting IO errors because of USB stick pull? What then?</p> </blockquote> <p>I've been thinking about this. Ideally we want to pass some kind of notification all the way up to the desktop and tell the user to plug the damn stick back in. Then have the USB stick become the same blockdev that it used to be, and complete the writeback. We are so far from being able to do that right now that it's not even funny.</p> <blockquote> <blockquote> <blockquote> <blockquote> <ol> <li>Background writebacks should skip pages which are PageError.</li> </ol> </blockquote> <p>That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?</p> </blockquote> <p>That's fair. 
What I want to avoid is triggering the same error every 30 seconds (or whatever the periodic writeback threshold is set to).</p> </blockquote> <p>So if kernel ring buffer overflows and so users miss the first error report, they'll have no idea that the data writeback is still failing?</p> </blockquote> <p>I wasn't thinking about kernel ringbuffer based reporting; I was thinking about errseq_t based reporting, so the application can tell the fsync failed and maybe does something application-level to recover like send the transactions across to another node in the cluster (or whatever this hypothetical application is).</p> <blockquote> <blockquote> <blockquote> <blockquote> <ol> <li>for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.</li> </ol> </blockquote> <p>Which may well be unmount. Are we really going to wait until unmount to report fatal errors?</p> </blockquote> <p>Goodness, no. The errors would be immediately reportable using the wb_err mechanism, as soon as the first error was encountered.</p> </blockquote> <p>But if there are no open files when the error occurs, that error won't get reported to anyone. Which means the next time anyone accesses that inode from a user context could very well be unmount or a third party sync/syncfs()....</p> </blockquote> <p>Right. But then that's on the application.</p> <hr /> <pre><code>From: &quot;Theodore Y. Ts'o&quot; &lt;tytso@....edu&gt; Date: Wed, 18 Apr 2018 21:08:19 -0400 </code></pre> <p>On Wed, Apr 18, 2018 at 05:40:37PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>I've been thinking about this. Ideally we want to pass some kind of notification all the way up to the desktop and tell the user to plug the damn stick back in. Then have the USB stick become the same blockdev that it used to be, and complete the writeback. 
We are so far from being able to do that right now that it's not even funny.</p> </blockquote> <p>Maybe we shouldn't be trying to do any of this in the kernel, or at least as little as possible in the kernel? Perhaps it would be better to do most of this as a device mapper hack; I suspect we'll need userspace help to figure out whether the user has plugged the same USB stick in, or a different USB stick, anyway.</p> <hr /> <hr /> <pre><code>From: Christoph Hellwig &lt;hch@...radead.org&gt; Date: Thu, 19 Apr 2018 01:39:04 -0700 </code></pre> <p>On Wed, Apr 18, 2018 at 12:52:19PM -0400, J. Bruce Fields wrote:</p> <blockquote> <blockquote> <p>Theodore Y. Ts'o - 10.04.18, 20:43:</p> <blockquote> <p>First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.</p> </blockquote> </blockquote> <p>Pointers to documentation or papers or anything? The only google results I can find for &quot;power fail certified&quot; are your posts.</p> <p>I've always been confused by SSD power-loss protection, as nobody seems completely clear whether it's a safety or a performance feature.</p> </blockquote> <p>Devices from reputable vendors should always be power fail safe, bugs notwithstanding. What power-loss protection in marketing slides usually means is that an SSD has a non-volatile write cache. That is once a write is ACKed data is persisted and no additional cache flush needs to be sent. 
This is a feature only available in expensive enterprise SSDs as the required capacitors are expensive. Cheaper consumer or boot drive SSDs have a volatile write cache, that is we need to do a separate cache flush to persist data (REQ_OP_FLUSH in Linux). But a reasonable implementation of those still won't corrupt previously written data, they will just lose the volatile write cache that hasn't been flushed. Occasional bugs, bad actors or other issues might still happen.</p> <hr /> <pre><code>From: &quot;J. Bruce Fields&quot; &lt;bfields@...ldses.org&gt; Date: Thu, 19 Apr 2018 10:10:16 -0400 </code></pre> <p>On Thu, Apr 19, 2018 at 01:39:04AM -0700, Christoph Hellwig wrote:</p> <blockquote> <p>On Wed, Apr 18, 2018 at 12:52:19PM -0400, J. Bruce Fields wrote:</p> <blockquote> <blockquote> <p>Theodore Y. Ts'o - 10.04.18, 20:43:</p> <blockquote> <p>First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.</p> </blockquote> </blockquote> <p>Pointers to documentation or papers or anything? The only google results I can find for &quot;power fail certified&quot; are your posts.</p> <p>I've always been confused by SSD power-loss protection, as nobody seems completely clear whether it's a safety or a performance feature.</p> </blockquote> <p>Devices from reputable vendors should always be power fail safe, bugs notwithstanding. 
What power-loss protection in marketing slides usually means is that an SSD has a non-volatile write cache. That is once a write is ACKed data is persisted and no additional cache flush needs to be sent. This is a feature only available in expensive enterprise SSDs as the required capacitors are expensive. Cheaper consumer or boot drive SSDs have a volatile write cache, that is we need to do a separate cache flush to persist data (REQ_OP_FLUSH in Linux). But a reasonable implementation of those still won't corrupt previously written data, they will just lose the volatile write cache that hasn't been flushed. Occasional bugs, bad actors or other issues might still happen.</p> </blockquote> <p>Thanks! That was my understanding too. But then the name is terrible. As is all the vendor documentation I can find:</p> <blockquote> <p><a href="https://insights.samsung.com/2016/03/22/power-loss-protection-how-ssds-are-protecting-data-integrity-white-paper/">https://insights.samsung.com/2016/03/22/power-loss-protection-how-ssds-are-protecting-data-integrity-white-paper/</a></p> <p>&quot;Power loss protection is a critical aspect of ensuring data integrity, especially in servers or data centers.&quot;</p> <p><a href="https://www.intel.com/content/.../ssd-320-series-power-loss-data-protection-brief.pdf">https://www.intel.com/content/.../ssd-320-series-power-loss-data-protection-brief.pdf</a></p> <p>&quot;Data safety features prepare for unexpected power-loss and protect system and user data.&quot;</p> </blockquote> <p>Why do they all neglect to mention that their consumer drives are also perfectly capable of well-defined behavior after power loss, just at the expense of flush performance? It's ridiculously confusing.</p> <hr /> <pre><code>From: Matthew Wilcox &lt;willy@...radead.org&gt; Date: Thu, 19 Apr 2018 10:40:10 -0700 </code></pre> <p>On Wed, Apr 18, 2018 at 09:08:19PM -0400, Theodore Y. 
Ts'o wrote:</p> <blockquote> <p>On Wed, Apr 18, 2018 at 05:40:37PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>I've been thinking about this. Ideally we want to pass some kind of notification all the way up to the desktop and tell the user to plug the damn stick back in. Then have the USB stick become the same blockdev that it used to be, and complete the writeback. We are so far from being able to do that right now that it's not even funny.</p> </blockquote> <p>Maybe we shouldn't be trying to do any of this in the kernel, or at least as little as possible in the kernel? Perhaps it would be better to do most of this as a device mapper hack; I suspect we'll need userspace help to figure out whether the user has plugged the same USB stick in, or a different USB stick, anyway.</p> </blockquote> <p>The device mapper target (dm-removable?) was my first idea too, but I kept thinking through use cases and I think we end up wanting this functionality in the block layer. Let's try a story.</p> <p>Stephen the PFY goes into the data centre looking to hotswap a failed drive. Due to the eight pints of lager he had for lunch, he pulls out the root drive instead of the failed drive. The air raid siren warbles and he realises his mistake, shoving the drive back in.</p> <p>CYOA:</p> <p>Currently: All writes are lost, calamities ensue. The PFY is fired.</p> <p>With dm-removable: Nobody thought to set up dm-removable on the root drive. Calamities still ensue, but now it's the BOFH's fault instead of the PFY's fault.</p> <p>Built into the block layer: After a brief hiccup while we reattach the drive to its block_device, the writes resume and nobody loses their job.</p> <hr /> <pre><code>From: &quot;Theodore Y. Ts'o&quot; &lt;tytso@....edu&gt; Date: Thu, 19 Apr 2018 19:27:15 -0400 </code></pre> <p>On Thu, Apr 19, 2018 at 10:40:10AM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>With dm-removable: Nobody thought to set up dm-removable on the root drive. 
Calamities still ensue, but now it's the BOFH's fault instead of the PFY's fault.</p> <p>Built into the block layer: After a brief hiccup while we reattach the drive to its block_device, the writes resume and nobody loses their job.</p> </blockquote> <p>What you're talking about is a deployment issue, though. Ultimately the distribution will set up dm-removable automatically if the user requests it, much like it sets up dm-crypt automatically for laptop users upon request.</p> <p>My concern is that not all removable devices have a globally unique id number available in hardware so the kernel can tell whether or not it's the same device that has been plugged in. There are heuristics you could use -- for example, you could look at the file system uuid plus the last fsck time. But they tend to be very file system specific, and not things we would want to have in the kernel.</p> <hr /> <pre><code>From: Dave Chinner &lt;david@...morbit.com&gt; Date: Fri, 20 Apr 2018 09:28:59 +1000 </code></pre> <p>On Wed, Apr 18, 2018 at 05:40:37PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Thu, Apr 19, 2018 at 10:13:43AM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:38:14PM -0700, Matthew Wilcox wrote:</p> <blockquote> <p>On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:</p> <blockquote> <p>On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:</p> <blockquote> <ol> <li>If we get an error while wbc-&gt;for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.</li> </ol> </blockquote> <p>So you're saying we should treat it as a transient error rather than a permanent error.</p> </blockquote> <p>Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.</p> </blockquote> <p>And if it's getting IO errors because of USB stick pull? What then?</p> </blockquote> <p>I've been thinking about this. 
Ideally we want to pass some kind of notification all the way up to the desktop and tell the user to plug the damn stick back in. Then have the USB stick become the same blockdev that it used to be, and complete the writeback. We are so far from being able to do that right now that it's not even funny.</p> </blockquote> <p><em>nod</em></p> <p>But in the meantime, device unplug (should give ENODEV, not EIO) is a fatal error and we need to toss away the data.</p> <blockquote> <blockquote> <blockquote> <blockquote> <blockquote> <ol> <li>Background writebacks should skip pages which are PageError.</li> </ol> </blockquote> <p>That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?</p> </blockquote> <p>That's fair. What I want to avoid is triggering the same error every 30 seconds (or whatever the periodic writeback threshold is set to).</p> </blockquote> <p>So if kernel ring buffer overflows and so users miss the first error report, they'll have no idea that the data writeback is still failing?</p> </blockquote> <p>I wasn't thinking about kernel ringbuffer based reporting; I was thinking about errseq_t based reporting, so the application can tell the fsync failed and maybe does something application-level to recover like send the transactions across to another node in the cluster (or whatever this hypothetical application is).</p> </blockquote> <p>But if it's still failing, then we should be still trying to report the error. i.e. if fsync fails and the page remains dirty, then the next attempt to write it is a new error and fsync should report that. 
IOWs, I think we should be returning errors at every occasion errors need to be reported if we have a persistent writeback failure...</p> <blockquote> <blockquote> <blockquote> <blockquote> <blockquote> <ol> <li>for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.</li> </ol> </blockquote> <p>Which may well be unmount. Are we really going to wait until unmount to report fatal errors?</p> </blockquote> <p>Goodness, no. The errors would be immediately reportable using the wb_err mechanism, as soon as the first error was encountered.</p> </blockquote> <p>But if there are no open files when the error occurs, that error won't get reported to anyone. Which means the next time anyone accesses that inode from a user context could very well be unmount or a third party sync/syncfs()....</p> </blockquote> <p>Right. But then that's on the application.</p> </blockquote> <p>Which we know don't do the right thing. Seems like a lot of hoops to jump through given it still won't work if the application isn't changed to support Linux-specific error handling requirements...</p> <hr /> <pre><code>From: Jan Kara &lt;jack@...e.cz&gt; Date: Sat, 21 Apr 2018 18:59:54 +0200 </code></pre> <p>On Fri 13-04-18 07:48:07, Matthew Wilcox wrote:</p> <blockquote> <p>On Tue, Apr 10, 2018 at 03:07:26PM -0700, Andres Freund wrote:</p> <blockquote> <p>I don't think that's the full issue. We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, <em>THAT'S</em> the big issue.</p> <p>Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory fsync everything, and <em>then</em> assume you're safe. 
But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.</p> </blockquote> <p>While accepting that under memory pressure we can still evict the error indicators, we can do a better job than we do today. The current design of error reporting says that all errors which occurred before you opened the file descriptor are of no interest to you. I don't think that's necessarily true, and it's actually a change of behaviour from before the errseq work.</p> <p>Consider Stupid Task A which calls open(), write(), close(), and Smart Task B which calls open(), write(), fsync(), close() operating on the same file. If A goes entirely before B and encounters an error, before errseq_t, B would see the error from A's write.</p> <p>If A and B overlap, even a little bit, then B still gets to see A's error today. But if writeback happens for A's write before B opens the file then B will never see the error.</p> <p>B doesn't want to see historical errors that a previous invocation of B has already handled, but we know whether <em>anyone</em> has seen the error or not. So here's a patch which restores the historical behaviour of seeing old unhandled errors on a fresh file descriptor:</p> <p>Signed-off-by: Matthew Wilcox <a href="mailto:mawilcox@...rosoft.com">mawilcox@...rosoft.com</a></p> </blockquote> <p>So I agree with going to the old semantics of reporting errors from before a file was open at least once to someone. As the PG case shows apps are indeed relying on the old behavior. As much as it is unreliable, it ends up doing the right thing for these apps in 99% of cases and we shouldn't break them (BTW IMO the changelog should contain a note that this fixes a regression of PostgreSQL, a reference to this thread and CC to stable). 
Anyway feel free to add:</p> <p>Reviewed-by: Jan Kara <a href="mailto:jack@...e.cz">jack@...e.cz</a></p> <p>Oh, and to make myself clear I do think we need to find a better way of reporting IO errors. I consider this just an immediate band-aid to avoid userspace regressions.</p> <blockquote> <pre><code>diff --git a/lib/errseq.c b/lib/errseq.c
index df782418b333..093f1fba4ee0 100644
--- a/lib/errseq.c
+++ b/lib/errseq.c
@@ -119,19 +119,11 @@ EXPORT_SYMBOL(errseq_set);
 errseq_t errseq_sample(errseq_t *eseq)
 {
 	errseq_t old = READ_ONCE(*eseq);
-	errseq_t new = old;

-	/*
-	 * For the common case of no errors ever having been set, we can skip
-	 * marking the SEEN bit. Once an error has been set, the value will
-	 * never go back to zero.
-	 */
-	if (old != 0) {
-		new |= ERRSEQ_SEEN;
-		if (old != new)
-			cmpxchg(eseq, old, new);
-	}
-	return new;
+	/* If nobody has seen this error yet, then we can be the first. */
+	if (!(old &amp; ERRSEQ_SEEN))
+		old = 0;
+	return old;
</code></pre> </blockquote> <hr /> <pre><code>From: Jan Kara &lt;jack@...e.cz&gt; Date: Sat, 21 Apr 2018 20:14:29 +0200 </code></pre> <p>On Thu 12-04-18 07:09:14, Jeff Layton wrote:</p> <blockquote> <p>On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:</p> <blockquote> <p>At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error. That way err = close(open(&quot;file&quot;)); would report the most recent error.</p> <p>That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's &quot;good enough&quot; to only support it for recent files.</p> <p>Jeff, what do you think?</p> </blockquote> <p>I hate it :). 
We could do that, but....yecchhhh.</p> <p>Reporting errors only in the case where the inode happened to stick around in the cache seems too unreliable for real-world usage, and might be problematic for some use cases. I'm also not sure it would really be helpful.</p> </blockquote> <p>So this is never going to be perfect but I think we could do good enough by: 1) Mark inodes that hit IO error. 2) If the inode gets evicted from memory we store the fact that we hit an error for this IO in a more space efficient data structure (sparse bitmap, radix tree, extent tree, whatever). 3) If the underlying device gets destroyed, we can just switch the whole SB to an error state and forget per inode info. 4) If there's too much of per-inode error info (probably per-fs configurable limit in terms of number of inodes), we would yell in the kernel log, switch the whole fs to the error state and forget per inode info.</p> <p>This way there won't be silent loss of IO errors. Memory usage would be reasonably limited. It could happen the whole fs would switch to error state &quot;prematurely&quot; but if that's a problem for the machine, admin could tune the limit for number of inodes to keep IO errors for...</p> <blockquote> <p>I think the crux of the matter here is not really about error reporting, per-se.</p> </blockquote> <p>I think this is related but a different question.</p> <blockquote> <p>I asked this at LSF last year, and got no real answer:</p> <p>When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?</p> <p>One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.</p> <p>Maybe that's ok in the face of a writeback error though? 
IDK.</p> </blockquote> <p>I can see the admin wanting to rather kill the machine with OOM than having to deal with data loss due to IO errors (e.g. if he has HA server fail over set up). Or retry for some time before dropping the dirty data. Or do what we do now (possibly with invalidating pages as you say). As Dave said elsewhere there's not one strategy that's going to please everybody. So it might be beneficial to have this configurable like XFS has it for metadata.</p> <p>OTOH if I look at the problem from application developer POV, most apps will just declare game over at the face of IO errors (if they take care to check for them at all). And the sophisticated apps that will try some kind of error recovery have to be prepared that the data is just gone (as depending on what exactly the kernel does is rather fragile) so I'm not sure how much practical value the configurable behavior on writeback errors would bring.</p> <hr /> <p></p>
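<p>The Task A / Task B scenario and the <code>errseq_sample()</code> change discussed in this thread can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not the kernel's <code>lib/errseq.c</code>: the class <code>ErrSeq</code>, its methods <code>sample_current</code>, <code>sample_patched</code> and <code>check_and_advance</code>, and the helper <code>fsync_after_earlier_failure</code> are hypothetical names, the state is kept in three plain fields rather than one packed integer updated with <code>cmpxchg</code>, and the counter is bumped on every error for simplicity.</p>

```python
import errno


class ErrSeq:
    """Toy model of a per-file writeback error cursor (hypothetical, not kernel code)."""

    def __init__(self):
        self.err = 0        # most recent writeback error code, 0 if none
        self.counter = 0    # bumped on every recorded error (simplification)
        self.seen = False   # has anyone been told about the latest error?

    def set(self, err):
        """Record a writeback failure such as errno.EIO."""
        self.err = err
        self.counter += 1
        self.seen = False

    def sample_current(self):
        """Pre-patch open(): start from the current state and mark it seen."""
        if self.counter:
            self.seen = True
        return self.counter

    def sample_patched(self):
        """Patched open(): if nobody has seen the error yet, start from zero."""
        return self.counter if self.seen else 0

    def check_and_advance(self, since):
        """Model of fsync(): report an error newer than the caller's sample."""
        if self.counter != since:
            self.seen = True
            return self.err, self.counter
        return 0, since


def fsync_after_earlier_failure(patched):
    """Task A's write fails in writeback, then Task B opens the file and fsyncs."""
    eseq = ErrSeq()
    eseq.set(errno.EIO)  # writeback of A's data fails before B opens
    since = eseq.sample_patched() if patched else eseq.sample_current()
    err, _ = eseq.check_and_advance(since)
    return err
```

<p>In this model, <code>fsync_after_earlier_failure(patched=False)</code> returns 0, so Task B never learns of A's failed write, while <code>fsync_after_earlier_failure(patched=True)</code> returns <code>errno.EIO</code>. Once one caller has seen the error, later openers sample the current state again and are not told about it a second time.</p>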