<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Tom Hipwell</title><link>https://tomhipwell.co/</link><description>Recent content on Tom Hipwell</description><generator>Hugo -- gohugo.io</generator><language>en-gb</language><lastBuildDate>Wed, 08 Apr 2026 20:28:53 +0000</lastBuildDate><atom:link href="https://tomhipwell.co/index.xml" rel="self" type="application/rss+xml"/><item><title>Bombadil</title><link>https://tomhipwell.co/blog/bombadil/</link><pubDate>Wed, 08 Apr 2026 20:28:53 +0000</pubDate><guid>https://tomhipwell.co/blog/bombadil/</guid><description>I have loved following along with Will Wilson&amp;rsquo;s Antithesis. Their blog posts and podcasts are consistently really strong. They&amp;rsquo;ve recently hired the creator of Hypothesis, one of those fuzz testing frameworks that I&amp;rsquo;ve read lots about but never used in anger, probably because of my poor understanding of their value (would the bugs I found actually matter?).
It&amp;rsquo;s an interesting development as it&amp;rsquo;s a bit of a change in direction for Antithesis.</description></item><item><title>Impeccable</title><link>https://tomhipwell.co/blog/impeccable/</link><pubDate>Wed, 08 Apr 2026 13:30:39 +0000</pubDate><guid>https://tomhipwell.co/blog/impeccable/</guid><description>Design language is definitely something I struggle with; I find prebuilt skills bundles a little heavyweight (not sure I want an extra 20 commands&amp;hellip;) but I&amp;rsquo;ve bookmarked this to pick through later.</description></item><item><title>We Used Autoresearch on Our AI Skill, It Taught Us to Write Better Tests</title><link>https://tomhipwell.co/blog/we_used_autoresearch_on_our_ai_skill_it_taught_us_to_write_better_tests/</link><pubDate>Sat, 28 Mar 2026 21:16:20 +0000</pubDate><guid>https://tomhipwell.co/blog/we_used_autoresearch_on_our_ai_skill_it_taught_us_to_write_better_tests/</guid><description>Folks have been having a lot of fun with autoresearch; I haven&amp;rsquo;t yet tried it out but I&amp;rsquo;m keen to. This article from the Langfuse team is a nice summary of the tradeoffs consciously (or unconsciously) being made:
Autoresearch optimizes for exactly what you measure given the context you execute in. If your target function has gaps, it will find and exploit them. The community around autoresearch has been raising this same concern: it&amp;rsquo;s Goodhart&amp;rsquo;s Law at machine speed.</description></item><item><title>How we made Ramp Sheets self-maintaining</title><link>https://tomhipwell.co/blog/how_we_made_ramp_sheets_self_maintaining/</link><pubDate>Tue, 24 Mar 2026 23:12:16 +0000</pubDate><guid>https://tomhipwell.co/blog/how_we_made_ramp_sheets_self_maintaining/</guid><description>Definitely starting to hear of more places rolling their own systems like this and having success. The feedback loop between the agent and the environment matters so much that commercial software likely won&amp;rsquo;t work in this space for a while, as the article points out:
Current frontier models are very capable at a wide range of software engineering tasks, but they cannot synthesize a large codebase with a large observability surface and determine what needs attention.</description></item><item><title>Quoting Andrii Yakovenko</title><link>https://tomhipwell.co/blog/quoting_andrii_yakovenko/</link><pubDate>Mon, 23 Mar 2026 09:12:56 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_andrii_yakovenko/</guid><description>Another nice idea from Intercom:
So we built a publishing platform. In the same conversation where you create something, you say &amp;ldquo;share this&amp;rdquo; and your work - a fully interactive web app with charts, tables, and filters - gets published to a shared catalog with its own URL. It has versioning so you can iterate. Multiple people can contribute to the same page. View tracking so you know who&amp;rsquo;s using it.</description></item><item><title>Every layer of review makes you 10x slower</title><link>https://tomhipwell.co/blog/every_layer_of_review_makes_you_10x_slower/</link><pubDate>Wed, 18 Mar 2026 23:37:03 +0000</pubDate><guid>https://tomhipwell.co/blog/every_layer_of_review_makes_you_10x_slower/</guid><description>The truism in the title tells you all you need to know. I&amp;rsquo;ve been pondering this question lately (to review or not to review?) and I think this post lands a series of sensible points. Worth a read.</description></item><item><title>Brian Scanlon on Claude Code at scale at Intercom</title><link>https://tomhipwell.co/blog/brian_scanlon_on_claude_code_at_scale_at_intercom/</link><pubDate>Tue, 17 Mar 2026 22:31:21 +0000</pubDate><guid>https://tomhipwell.co/blog/brian_scanlon_on_claude_code_at_scale_at_intercom/</guid><description>There were some wild stats out of Intercom the other week on their Claude authored PR ratio (they are through the 90% mark). In this twitter thread Brian breaks down some of the tooling they&amp;rsquo;ve built to make this possible. 
Rad.</description></item><item><title>The Self-Driving Codebase</title><link>https://tomhipwell.co/blog/the_self_driving_codebase/</link><pubDate>Sat, 28 Feb 2026 19:26:46 +0000</pubDate><guid>https://tomhipwell.co/blog/the_self_driving_codebase/</guid><description>Mileage might vary on how interesting the content is here, I think most of us are familiar with the arguments, but the microsite itself is gorgeous - the use of animations and different modes of interaction is so engaging that I just had to link it here (mostly for my own bookmarks). I&amp;rsquo;m looking forward to more of the web looking and feeling like this. Enjoy.</description></item><item><title>How We Built Secure, Scalable Agent Sandbox Infrastructure</title><link>https://tomhipwell.co/blog/how_we_built_secure_scalable_agent_sandbox_infrastructure/</link><pubDate>Sat, 28 Feb 2026 17:06:42 +0000</pubDate><guid>https://tomhipwell.co/blog/how_we_built_secure_scalable_agent_sandbox_infrastructure/</guid><description>I&amp;rsquo;m a sucker for a write-up on coding agent architecture, I think because I enjoy learning about how the sandboxing works. Here&amp;rsquo;s one from Browser Use (whose technical writing is often great). It&amp;rsquo;s a simple, effective design - probably exactly as you would do it if you started from a blank sheet of paper. I always wondered about the cold starts so learning a little about Unikraft was interesting. The juice on the design is this part in the middle:</description></item><item><title>Frontier</title><link>https://tomhipwell.co/blog/frontier/</link><pubDate>Tue, 24 Feb 2026 22:00:11 +0000</pubDate><guid>https://tomhipwell.co/blog/frontier/</guid><description>About a year ago at Nory we looked at the METR plots for coding performance; at that time the benchmark was predicting an 80% success rate for tasks that take humans about an hour to be hit at the end of 2026.
Opus 4.6 just went through that threshold (80% success at a 1 hour 3 minute task length, Feb &amp;lsquo;26 release) 8-10 months ahead of schedule. I remember thinking this time last year that the trend looked punchy and I wasn&amp;rsquo;t sure where the improvements would keep coming from, yet here we are.
What matters now is identity, community, belonging, and being out in the world, experiencing life in various ways. This means the intersection of who you work with, why you’re building, and what you’re building really matters and will directly reflect your values and your ability to be in communities.</description></item><item><title>A Codebase by an Agent for an Agent</title><link>https://tomhipwell.co/blog/a_codebase_by_an_agent_for_an_agen/</link><pubDate>Sat, 24 Jan 2026 22:31:03 +0000</pubDate><guid>https://tomhipwell.co/blog/a_codebase_by_an_agent_for_an_agen/</guid><description>I thought this was a really novel perspective in the oceans of content on gencode out there. It&amp;rsquo;s also an interesting idea, writing frameworks inspired by the past and then leaning into the model instinct on code naming and organisation. If the temperature of the LLM call is low and the context is stable then it makes sense that its guesses for naming etc. will be similar each time, and you should get an efficiency gain from leaning into the LLMs having echoes of past solutions compressed in its weights.</description></item><item><title>The Bitter Lesson of Agent Frameworks</title><link>https://tomhipwell.co/blog/the_bitter_lesson_of_agent_frameworks/</link><pubDate>Sat, 17 Jan 2026 22:18:11 +0000</pubDate><guid>https://tomhipwell.co/blog/the_bitter_lesson_of_agent_frameworks/</guid><description>It&amp;rsquo;s amazing how fast the zeitgeist is swinging around here:
Every time you add a &amp;ldquo;smart&amp;rdquo; wrapper around model behavior - planning modules, verification layers, output parsers - you&amp;rsquo;re encoding what you think the model should do. But the model was trained on millions of examples. It has seen more patterns than you can anticipate. Your abstractions become constraints that prevent the model from using what it learned.
The Bitter Lesson from ML research is clear: general methods that leverage computation beat hand-crafted human knowledge every time.</description></item><item><title>The AI Tourist Problem</title><link>https://tomhipwell.co/blog/the_ai_tourist_problem/</link><pubDate>Wed, 14 Jan 2026 08:33:39 +0000</pubDate><guid>https://tomhipwell.co/blog/the_ai_tourist_problem/</guid><description>Kyle Poyar&amp;rsquo;s writing at Growth Unhinged is normally solid and well researched, plus a handy source of benchmarks if you&amp;rsquo;re trying to evaluate startups, so it tends to be one I watch out for. This piece has some interesting stats on NRR for B2B SaaS/B2C SaaS and AI companies. The number of datapoints varies and we should take the results with a pinch of salt as they&amp;rsquo;re based on scraped data but they do point to an interesting trend, the data shows:</description></item><item><title>qwen3-vl-embedding</title><link>https://tomhipwell.co/blog/qwen3_vl_embedding/</link><pubDate>Sun, 11 Jan 2026 19:21:41 +0000</pubDate><guid>https://tomhipwell.co/blog/qwen3_vl_embedding/</guid><description>Very exciting to have an open source vision language model this capable. The queries described in the post are so varied (and work across different axes - semantic understanding, text understanding, object/spatial recognition), I think this type of technology being cheaply/easily available is a big unlock for a lot of interesting product work.</description></item><item><title>Quoting Vicki Boykis</title><link>https://tomhipwell.co/blog/quoting_vicki_boykis/</link><pubDate>Fri, 02 Jan 2026 22:43:22 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_vicki_boykis/</guid><description>Vicki Boykis&amp;rsquo; year in review is excellent throughout (as her writing always is) but I loved this line in particular:
The forking branches of a decision tree in a codebase, are likewise boundless, and the neat part is that there is no right answer. You are constrained by your business requirements, but the choice of implementation of those requirements is of an endless variety. It will depend on: the stack you already have, the budget for the rest of the stack, your own past experience and engineering values, the social norms and expectations of your team and their collective experience, the industry’s vocabulary, the language you’re writing in, its conventions and affordances, and how much time you have to actually think about, finish, and merge this current PR before you need to deploy.</description></item><item><title>Economics of Orbital vs Terrestrial Data Centers</title><link>https://tomhipwell.co/blog/economics_of_orbital_vs_terrestrial_data_centers/</link><pubDate>Sun, 28 Dec 2025 13:23:33 +0000</pubDate><guid>https://tomhipwell.co/blog/economics_of_orbital_vs_terrestrial_data_centers/</guid><description>Fun blog post from Andrew McCalip that attempts to build a model of the unit economics of orbital data centers. It looks like they&amp;rsquo;re just about feasible but really there&amp;rsquo;s only one player in town. This paragraph is key I think:
This isn&amp;rsquo;t about talent. It&amp;rsquo;s about integration. If you have to buy launch, buy buses, buy power hardware, buy deployment, and pay margin at every interface, you never get there.</description></item><item><title>Regenerative Software</title><link>https://tomhipwell.co/blog/regenerative_software/</link><pubDate>Tue, 23 Dec 2025 14:54:37 +0000</pubDate><guid>https://tomhipwell.co/blog/regenerative_software/</guid><description>Chad Fowler&amp;rsquo;s take on principles for how the craft of software changes in the AI era:
The metaphor I keep returning to is the phoenix: systems designed to burn and be reborn, continuously, without losing their identity.
A regenerative system has a few defining traits:
Clear, durable boundaries that outlive any implementation
Tests and evaluations that define correctness independently of code
Automation that assumes replacement is normal, not exceptional
Explicit acceptance that code will rot, drift, or become incomprehensible
Cultural comfort with deletion, rewriting, and starting over
In such systems, failure is localized, recovery is fast, and improvement emerges through iteration rather than preservation.</description></item><item><title>Don't Build Agents, Build Skills Instead</title><link>https://tomhipwell.co/blog/don_t_build_agents_build_skills_instead/</link><pubDate>Sun, 21 Dec 2025 14:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/don_t_build_agents_build_skills_instead/</guid><description>A short talk from the AI Engineer conference in which two Anthropic engineers (Barry Zhang and Mahesh Murag) make the case that you don&amp;rsquo;t need to build agents. Instead use a general purpose agent (Claude Code) and then write skills (skills are just pre-canned prompts expressed as markdown). The advantage of this approach is that skills are simple, versionable and composable. This last point seems the most important: no more wrangling graphs of actions.</description></item><item><title>Quoting Andrej Karpathy</title><link>https://tomhipwell.co/blog/quoting_andrej_karpathy/</link><pubDate>Sun, 16 Nov 2025 22:31:31 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_andrej_karpathy/</guid><description>Andrej neatly summarises where we are today:
In this new programming paradigm then, the new most predictive feature to look at is verifiability. If a task/job is verifiable, then it is optimizable directly or via reinforcement learning, and a neural net can be trained to work extremely well. It&amp;rsquo;s about to what extent an AI can &amp;ldquo;practice&amp;rdquo; something. The environment has to be resettable (you can start a new attempt), efficient (a lot of attempts can be made), and rewardable (there is some automated process to reward any specific attempt that was made).</description></item><item><title>The AI Bubble and the US Economy</title><link>https://tomhipwell.co/blog/the_ai_bubble_and_the_us_economy/</link><pubDate>Sun, 19 Oct 2025 06:55:01 +0000</pubDate><guid>https://tomhipwell.co/blog/the_ai_bubble_and_the_us_economy/</guid><description>I can never really tell how useful economic analyses are. Often the fundamentals are staring you in the face and then the bull market runs for years. Timing is everything. However, I thought this was a solid summary that feels balanced and thorough, so worth sharing.</description></item><item><title>When Will Quantum Computing Work?</title><link>https://tomhipwell.co/blog/when_will_quantum_computing_work/</link><pubDate>Sat, 11 Oct 2025 07:00:27 +0000</pubDate><guid>https://tomhipwell.co/blog/when_will_quantum_computing_work/</guid><description>Tom McCarthy breaks out the current state of quantum computing. For me what&amp;rsquo;s valuable here is not predictions on potential commercial applications or the timeline but instead the heuristic to use to track progress and the clear line in the sand for a commercially viable technology:
The key limitation is the size of the problem(s) that the QC can handle. Runtime, integration with real-time data, and performance vs classical optimization techniques also matter, but the main constraint is how many variables a 1 million qubit QC can handle.</description></item><item><title>Hacking with AI SASTs: An overview of 'AI Security Engineers' / 'LLM Security Scanners' for Penetration Testers and Security Teams</title><link>https://tomhipwell.co/blog/hacking_with_ai_sasts_an_overview_of_ai_security_engineers_llm_security_scanners_for_penetration_testers_and_security_teams/</link><pubDate>Thu, 02 Oct 2025 21:51:42 +0000</pubDate><guid>https://tomhipwell.co/blog/hacking_with_ai_sasts_an_overview_of_ai_security_engineers_llm_security_scanners_for_penetration_testers_and_security_teams/</guid><description>I enjoy posts like this deep dive from Joshua Rogers on &amp;ldquo;AI Security Engineers&amp;rdquo; as amidst so much noise they show the value that agents are adding at the frontier. Josh finds the tools generally useful, giving a good tear down in the post. I&amp;rsquo;m not quite convinced the tools are ready for prime time, there&amp;rsquo;s a few too many obvious gotchas outlined here (e.g. 
monorepo support, vulnerability to prompt injection).</description></item><item><title>Supporting our AI overlords: Redesigning data systems to be Agent-first</title><link>https://tomhipwell.co/blog/supporting_our_ai_overlords_redesigning_data_systems_to_be_agent_first/</link><pubDate>Thu, 18 Sep 2025 21:24:07 +0000</pubDate><guid>https://tomhipwell.co/blog/supporting_our_ai_overlords_redesigning_data_systems_to_be_agent_first/</guid><description>Interesting blog post on how database design changes for agentic workloads, first time I&amp;rsquo;ve seen the phrase &amp;ldquo;agentic speculation&amp;rdquo; used to describe agent querying patterns but it seems a good fit:</description></item><item><title>Git Cheat Sheet</title><link>https://tomhipwell.co/blog/git_cheat_sheet/</link><pubDate>Wed, 17 Sep 2025 06:10:46 +0000</pubDate><guid>https://tomhipwell.co/blog/git_cheat_sheet/</guid><description>A very simple thing but this cheat sheet is great, even has simple diagrams for the different merge strategies built in, which is probably the most common area of debate (and confusion) when working with teams. Handy.</description></item><item><title>LLMs as Retrieval and Recommendation Engines</title><link>https://tomhipwell.co/blog/llms_as_retrieval_and_recommendation_engines/</link><pubDate>Fri, 12 Sep 2025 17:37:47 +0000</pubDate><guid>https://tomhipwell.co/blog/llms_as_retrieval_and_recommendation_engines/</guid><description>Nice deep dive on using LLMs for retrieval/recommendation. It&amp;rsquo;s a two parter, and there&amp;rsquo;s also a great guide to building a retrieval engine using a constrained decoding approach with vLLM and a HF hosted model. 
The whole thing is about 30 LoC.</description></item><item><title>Embrace the Red's month of AI Bugs</title><link>https://tomhipwell.co/blog/embrace_the_red_s_month_of_ai_bugs/</link><pubDate>Sun, 07 Sep 2025 10:43:39 +0000</pubDate><guid>https://tomhipwell.co/blog/embrace_the_red_s_month_of_ai_bugs/</guid><description>I&amp;rsquo;ve really enjoyed following along with the Embrace the Red prompt injection series over the summer. Pretty much every major, hyped tool has been compromised by the same fatal flaw - LLMs today mix data and instructions in the same channel (the prompt) and the model doesn&amp;rsquo;t know how to separate the two things. The series finale (an old school self-replicating virus) is a particular treat. There&amp;rsquo;s not really (yet) a great pattern for solving this problem, there&amp;rsquo;s been a couple of papers but it looks like it will create a fundamental roadblock to the type of public-internet roaming, self-directed agents we&amp;rsquo;re expecting to be released.</description></item><item><title>Quoting Jamie Tomalin</title><link>https://tomhipwell.co/blog/quoting_jamie_tomalin/</link><pubDate>Mon, 30 Jun 2025 21:52:58 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_jamie_tomalin/</guid><description>Some words on AI strategy from Jamie:
Perhaps, the AI-maxi strategy is building vertically integrated operating companies which wield strategic control to own the upside of AI, fundamentally transforming the economics of their business relative to incumbents, enabling them to counter-position and disrupt by selling directly to the end customer?
e.g., Paloma Health, Convictional, Candidly
Jamie Tomalin</description></item><item><title>Armin Ronacher's Agentic Coding Recommendations</title><link>https://tomhipwell.co/blog/armin_ronacher_s_agentic_coding_recommendations/</link><pubDate>Sat, 14 Jun 2025 20:09:11 +0000</pubDate><guid>https://tomhipwell.co/blog/armin_ronacher_s_agentic_coding_recommendations/</guid><description>Armin Ronacher (creator of Flask) has a great piece on agentic coding patterns, insightful throughout but largely centered on the uplift you get from effective tool use:
Agentic coding&amp;rsquo;s inefficiency largely arises from inference cost and suboptimal tool usage. Let me reiterate: quick, clear tool responses are vital.
For this reason, he tends to avoid MCP:
The reason I barely use it is because Claude Code is very capable of just running regular tools.</description></item><item><title>Quoting Elad Gil</title><link>https://tomhipwell.co/blog/quoting_elad_gil/</link><pubDate>Tue, 03 Jun 2025 06:13:30 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_elad_gil/</guid><description>Elad Gil on AI Rollups:
“It just seems so obvious,” said Gil over a Zoom call earlier this week. “This type of generative AI is very good at understanding language, manipulating language, manipulating text, producing text. And that’s audio, that’s video, that includes coding, sales outreach, and different back-office processes.”
If you can “effectively transform some of those repetitive tasks into software,” he said, “you can increase the margins dramatically and create very different types of businesses.</description></item><item><title>Quoting Shunyu Yao</title><link>https://tomhipwell.co/blog/quoting_shunyu_yao/</link><pubDate>Sat, 17 May 2025 21:36:14 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_shunyu_yao/</guid><description>Shunyu Yao, a researcher from OpenAI who worked on Deep Research, makes the case for fundamentally altering our approach to benchmarking now we&amp;rsquo;re in &amp;ldquo;the second half&amp;rdquo;:
Inertia is natural, but here is the problem. AI has beat world champions at chess and Go, surpassed most humans on SAT and bar exams, and reached gold medal level on IOI and IMO. But the world hasn’t changed much, at least judged by economics and GDP.</description></item><item><title>Cursor: Security</title><link>https://tomhipwell.co/blog/cursor_security/</link><pubDate>Sun, 11 May 2025 20:43:23 +0000</pubDate><guid>https://tomhipwell.co/blog/cursor_security/</guid><description>Simon&amp;rsquo;s blog is a gold mine. He just runs that bit further than everyone else and it shows time and again. Here he uses Cursor&amp;rsquo;s GDPR subprocessor disclosure to document their stack (the use of Fireworks and Turbopuffer is the interesting bit here). The killer bit is the disclosure at the end though:
When operating in privacy mode - which they say is enabled by 50% of their users - they are careful not to store any raw code on their servers for longer than the duration of a single request.</description></item><item><title>You should have private evals</title><link>https://tomhipwell.co/blog/you_should_have_private_evals/</link><pubDate>Fri, 09 May 2025 21:02:25 +0000</pubDate><guid>https://tomhipwell.co/blog/you_should_have_private_evals/</guid><description>I think this is a very good post. Taking the time to test for yourself and understand how each model generation is useful to you, in your context is clearly going to be a big advantage. So much of the assessment of LLMs is vibes based that your own vibes matter most, so spending some time defining what they are is important. This blog offers a framework, and examples, of how to do just that.</description></item><item><title>Quoting Chris Paxton</title><link>https://tomhipwell.co/blog/quoting_chris_paxton/</link><pubDate>Fri, 09 May 2025 20:12:49 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_chris_paxton/</guid><description>Nice explainer that sets out the boundaries of the RL techniques now dominating progress in AI. The list quoted here neatly describes what the jagged edge of AI will look like for the next little while:
Reinforcement learning is a powerful tool. Right now, though, it’s best used when:
You have a verifiable problem: math, coding, robot grasping
You have a way to generate a ton of data in this domain, but can’t necessarily generate optimal or even good data</description></item><item><title>The Bull Case for an AI Native Investment Bank</title><link>https://tomhipwell.co/blog/the_bull_case_for_an_ai_native_investment_bank/</link><pubDate>Thu, 08 May 2025 21:20:31 +0000</pubDate><guid>https://tomhipwell.co/blog/the_bull_case_for_an_ai_native_investment_bank/</guid><description>YC&amp;rsquo;s call for startups for the summer &amp;lsquo;25 batch includes a section on Fullstack AI, I&amp;rsquo;ve written about AI Rollups a few times on this blog, but it looks like the model might now accelerate.
Coincidentally, the same day OffDeal (a YC company) has published their blueprint for a rollup that takes on investment bank M&amp;amp;A. Somewhat unusually, there&amp;rsquo;s tonnes of detail in this strategy doc so I&amp;rsquo;ve pulled out a few interesting bits below.</description></item><item><title>A Short Note on Sycophants and Feedback Loops</title><link>https://tomhipwell.co/blog/a_short_note_on_sycophants_and_feedback_loops/</link><pubDate>Sat, 03 May 2025 12:06:48 +0000</pubDate><guid>https://tomhipwell.co/blog/a_short_note_on_sycophants_and_feedback_loops/</guid><description>This has been written about in a few places so I&amp;rsquo;ll keep it brief. It was interesting that one of the root causes (note, not the sole cause) of the ChatGPT sycophancy issues was the feedback loop from the thumbs up/down data on posts, from their blog post:
&amp;ldquo;We also teach our models how to apply these principles by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.&amp;rdquo;</description></item><item><title>The Leaderboard Illusion</title><link>https://tomhipwell.co/blog/the_leaderboard_illusion/</link><pubDate>Wed, 30 Apr 2025 11:11:41 +0000</pubDate><guid>https://tomhipwell.co/blog/the_leaderboard_illusion/</guid><description>Interesting paper from Cohere, I think this might cause a bit of a storm - basically it&amp;rsquo;s an investigation into biases towards closed source model companies (OpenAI, Meta, Google DeepMind are named) in Chatbot Arena.
There are three ways that the proprietary shops are favoured:
There are private testing practices that mean these model providers are able to test multiple variants before public release, enabling selective disclosure of results.
Proprietary closed models are sampled at higher rates (numbers of battles) and have fewer models removed from the arena than open-source/open-weights equivalents.</description></item><item><title>Mixture of Experts</title><link>https://tomhipwell.co/blog/mixture_of_experts/</link><pubDate>Sun, 20 Apr 2025 20:23:44 +0000</pubDate><guid>https://tomhipwell.co/blog/mixture_of_experts/</guid><description>Insightful post from J Betker on the MoE architecture. Here&amp;rsquo;s a few grabs:
The fact that MoE has great scaling properties indicates that something deeper is amiss with this architectural construct. This turns out to be sparsity itself – it is a new free parameter to the scaling laws for which sparsity=1 is suboptimal. Put another way – Chinchilla scaling laws focus on the relationship between data and compute, but MoEs give us another lever: the number of parameters in a neural network.</description></item><item><title>Claude Code: Best practices for agentic coding</title><link>https://tomhipwell.co/blog/claude_code_best_practices_for_agentic_coding/</link><pubDate>Sun, 20 Apr 2025 08:17:08 +0000</pubDate><guid>https://tomhipwell.co/blog/claude_code_best_practices_for_agentic_coding/</guid><description>This is really good, well worth the investment of your time. There is a lot of novel insight here that will shortly become de rigueur. There&amp;rsquo;s a few bits worth calling out.
The models are now heavily tuned for tool use, as we all know. gh CLI use is baked in:
Claude knows how to use the gh CLI to interact with GitHub for creating issues, opening pull requests, reading comments, and more.</description></item><item><title>A Realistic AI Timeline</title><link>https://tomhipwell.co/blog/a_realistic_ai_timeline/</link><pubDate>Sun, 13 Apr 2025 20:05:53 +0000</pubDate><guid>https://tomhipwell.co/blog/a_realistic_ai_timeline/</guid><description>Another AI prediction, but I think this one pinpoints some of the blockers much more clearly. In summary:
Roughly: generalist scaling does not work or, at least, not well enough to make meaningful sense for material deployment. Instead, most development, including agentification, happens in the smaller size range with specialized, opinionated training. Any actual &amp;ldquo;general intelligence&amp;rdquo; has to take an entirely different direction — one that is almost discouraged by formal evaluation.
Been vibe coding like a fiend. Task breakdown is a highly leveraged human decision. Coding models are both non-deterministic &amp;amp; sensitive to initial conditions. You&amp;rsquo;ll get very different results having your agent implement Task1-&amp;gt;Task2-&amp;gt;Task3 or Task2-&amp;gt;Task3-&amp;gt;Task1.
I don&amp;rsquo;t have good heuristics yet, I just observe that when I try to implement &amp;ldquo;the same thing&amp;rdquo; I get quite different results.
Kent Beck</description></item><item><title>Quoting Philip Tetlock</title><link>https://tomhipwell.co/blog/quoting_philip_tetlock/</link><pubDate>Thu, 10 Apr 2025 10:24:48 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_philip_tetlock/</guid><description>The master speaks on AI 2027 forecasts. The discussion of these forecasts has been rumbling on. Kokotajlo himself puts the probability of a supercoder on a 2027 timeline at around 50%.
I&amp;rsquo;m also impressed by Kokotajlo’s 2021 AI forecasts. It raises confidence in his Scenario 2027. But by how much? Tricky! In my earliest work on subjective-probability forecasting, 1984-85, few forecasters guessed how radical a reformer Gorbachev would be. But they were also the slowest to foresee the collapse of the USSR in 1991.</description></item><item><title>Quoting Neil Mehta</title><link>https://tomhipwell.co/blog/quoting_neil_mehta/</link><pubDate>Sun, 06 Apr 2025 08:23:14 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_neil_mehta/</guid><description>Great piece on Neil Mehta that has been doing the rounds this week. Interesting throughout. Greenoaks is very focussed on the founder, which is normal at seed but typically has less emphasis at A, B and onwards. There&amp;rsquo;s nothing unusual in what he&amp;rsquo;s saying; I think what is unusual is the level of conviction with which they pursue that one thing.
“This is controversial,” Mehta replied, when asked if the Greenoaks machine has identified an ideal type, “but I do believe there’s an archetype for a great founder.</description></item><item><title>AI 2027</title><link>https://tomhipwell.co/blog/ai_2027/</link><pubDate>Fri, 04 Apr 2025 22:02:53 +0000</pubDate><guid>https://tomhipwell.co/blog/ai_2027/</guid><description>I definitely don&amp;rsquo;t agree with all the predictions here (why do AI nerds always get obsessed with making geopolitical predictions?) and after the end of &amp;lsquo;26 everything goes a bit crazy. However, I see a lot of weak, poorly specified AI predictions so when you see one this detailed I think it is worth paying attention to. As they note, after the end of &amp;lsquo;26 the confidence level drops off. I&amp;rsquo;d suggest stopping reading at that point to save yourself the time (it&amp;rsquo;s highly speculative), though the predictions are quite fun.</description></item><item><title>As AI’s power grows, so does our workday</title><link>https://tomhipwell.co/blog/as_ai_s_power_grows_so_does_our_workday/</link><pubDate>Sat, 29 Mar 2025 12:09:44 +0000</pubDate><guid>https://tomhipwell.co/blog/as_ai_s_power_grows_so_does_our_workday/</guid><description>AI increases labour supply rather than reduces it, and watch out for those second order effects on society at large:
Occupations more exposed to generative AI saw a rise in work hours immediately following the release of ChatGPT. Compared to workers less exposed to generative AI (such as tire builders, wellhead pumpers, and surgical assistants) those in high-exposure occupations (including computer systems analysts, credit counsellors, and logisticians) worked roughly 3.15 hours more per week in the post-ChatGPT period.</description></item><item><title>Jevons Paradox: A Personal Perspective</title><link>https://tomhipwell.co/blog/jevons_paradox_a_personal_perspective/</link><pubDate>Thu, 27 Mar 2025 20:29:43 +0000</pubDate><guid>https://tomhipwell.co/blog/jevons_paradox_a_personal_perspective/</guid><description>Great post from Tina He on the future of work in the era of AI. Firstly, we&amp;rsquo;ve been coming at things all wrong:
Traditional economics might predict that AI-boosted productivity would reduce working hours, a four-day weekend for tasks that once took five days. But reality has different plans. We&amp;rsquo;re witnessing what I call the &amp;ldquo;labor rebound effect&amp;rdquo;—productivity doesn&amp;rsquo;t eliminate work; it transforms it, multiplies it, elevates its complexity. The time saved becomes time reinvested, often with compound interest.</description></item><item><title>Quoting Ankit Maloo</title><link>https://tomhipwell.co/blog/quoting_ankit_maloo/</link><pubDate>Mon, 24 Mar 2025 16:42:56 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_ankit_maloo/</guid><description>Similar to the Model is the Product a couple of weeks ago, the bitter lesson here is that brute forcing problems with compute wins versus clever solutions. Scaling compute at inference time with RL is the latest application of the bitter lesson, and we&amp;rsquo;re already seeing it move the needle in production use cases (customer support and soon, coding). This has big ramifications in the AI application layer:
While many companies are focused on building wrappers around generic models, essentially constraining the model to follow specific workflow paths, the real breakthrough would come from companies investing in post-training RL compute.</description></item><item><title>Cursor rules, prompt injections, voice to text and Diane</title><link>https://tomhipwell.co/blog/cursor_rules_prompt_injections_voice_to_text_and_diane/</link><pubDate>Sun, 23 Mar 2025 20:33:00 +0000</pubDate><guid>https://tomhipwell.co/blog/cursor_rules_prompt_injections_voice_to_text_and_diane/</guid><description>Let&amp;rsquo;s join the dots between a few different themes this week.
First up, Cursor rules files are vulnerable to prompt injection attacks. It&amp;rsquo;s possible to embed prompts within the rules files and hide them using invisible characters.
You can then use this poisoned rules file to redirect Cursor/your agentic IDE of choice towards malicious implementations. This is not a huge surprise - the point of rules files is to direct the LLM towards specific implementations.</description></item><item><title>The Model is the Product</title><link>https://tomhipwell.co/blog/the_model_is_the_product/</link><pubDate>Sun, 02 Mar 2025 18:21:16 +0000</pubDate><guid>https://tomhipwell.co/blog/the_model_is_the_product/</guid><description>I think this is a strong take on the consequences of the recent RL breakthroughs from Alexander Doria:
I think it&amp;rsquo;s time to call it: the model is the product.
All current factors in research and market development push in this direction.
Generalist scaling is stalling. This was the whole message behind the release of GPT-4.5: capacities are growing linearly while compute costs are on a geometric curve. Even with all the efficiency gains in training and infrastructure of the past two years, OpenAI can&amp;rsquo;t deploy this giant model with remotely affordable pricing.</description></item><item><title>Claude 3.7 Sonnet</title><link>https://tomhipwell.co/blog/claude_3_7_sonnet/</link><pubDate>Mon, 24 Feb 2025 19:53:21 +0000</pubDate><guid>https://tomhipwell.co/blog/claude_3_7_sonnet/</guid><description>Lots to digest here. A few pull quotes from the press release. Coding use cases are the focus of the upgraded model:
Claude 3.7 Sonnet shows particularly strong improvements in coding and front-end web development. Along with the model, we’re also introducing a command line tool for agentic coding, Claude Code. Claude Code is available as a limited research preview, and enables developers to delegate substantial engineering tasks to Claude directly from their terminal.</description></item><item><title>On ICPs vs Strength of PMF</title><link>https://tomhipwell.co/blog/on_icps_vs_strength_of_pmf/</link><pubDate>Fri, 21 Feb 2025 13:00:58 +0000</pubDate><guid>https://tomhipwell.co/blog/on_icps_vs_strength_of_pmf/</guid><description>Hot take. Product teams talking too much about ICPs is a red flag. ICPs are for sales and marketing teams. They&amp;rsquo;re at the blunt end and need to narrow their focus to maximise win rate and build a hyper efficient growth engine.
Product teams need to know and understand their ICP to support prioritisation, but they should be thinking in terms of Product Market Fit strength across segments. We need to have that peripheral vision and understand the whole picture.</description></item><item><title>Karpathy's Vibes Check</title><link>https://tomhipwell.co/blog/karpathy_s_vibes_check/</link><pubDate>Tue, 18 Feb 2025 12:41:04 +0000</pubDate><guid>https://tomhipwell.co/blog/karpathy_s_vibes_check/</guid><description>I thought this post was interesting, not so much for conclusion about Grok 3 but instead for the range of tests that Andrej performs to get a feel for the capabilities of the model in &amp;lt;=~2 hours. It&amp;rsquo;s all there - the recall/reasoning without search of the GPT-2 training FLOPs, a few varied dev tasks, research tasks, search tasks (including a gut feel for hallucinations), ethics, personality, then a battery of standard LLM assessments (&amp;lsquo;r&amp;rsquo;s in strawberry, 9.</description></item><item><title>Quoting Harper Reed</title><link>https://tomhipwell.co/blog/quoting_harper_reed/</link><pubDate>Mon, 17 Feb 2025 19:42:24 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_harper_reed/</guid><description>Another day, another AI dev flow. There&amp;rsquo;s some common patterns emerging now (use of markdown files like spec.md, todo.md etc.) and I thought the blog gave a nice step by step guide and prompts to borrow. Basically the advice reduces to &amp;ldquo;spend a lot of time planning with reasoning models up front&amp;rdquo;. I liked this thought too:
I have spent years coding by myself, years coding as a pair, and years coding in a team.</description></item><item><title>Quoting Nelson Elhage</title><link>https://tomhipwell.co/blog/quoting_nelson_elhage/</link><pubDate>Sun, 09 Feb 2025 19:44:17 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_nelson_elhage/</guid><description>Great post from Nelson Elhage (Anthropic pre-training team) on adventures coding with Sonnet. Much of the post just describes the same journey that a lot of us are on at the moment (I&amp;rsquo;m still finding these posts fun to read, I wonder when the sense of wonder will be replaced by one of fatigue?), but there&amp;rsquo;s a couple of thoughtful nuggets towards the end that I&amp;rsquo;ve pulled out here:
You can now generate thousands of lines of code at a price of mere cents; but no human will understand them, and the LLMs are, for now, worse at debugging and refactoring and designing and maintaining those lines of code than they are at generating them.</description></item><item><title>S1: Scalable test-time compute for $6</title><link>https://tomhipwell.co/blog/s1_scalable_test_time_compute_for_6/</link><pubDate>Wed, 05 Feb 2025 22:20:37 +0000</pubDate><guid>https://tomhipwell.co/blog/s1_scalable_test_time_compute_for_6/</guid><description>The title is a little click-baity, but the analysis of the paper in the blog is great. A fast download of one (quite hacky, fun) approach to getting scalable test-time compute.</description></item><item><title>Quoting Dario Amodei</title><link>https://tomhipwell.co/blog/quoting_dario_amodei/</link><pubDate>Wed, 29 Jan 2025 21:32:06 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_dario_amodei/</guid><description>The insights here are not novel, but Dario provides a strong mental model of how the AI system will keep evolving over time:
Shifting the curve. The field is constantly coming up with ideas, large and small, that make things more effective or efficient: it could be an improvement to the architecture of the model (a tweak to the basic Transformer architecture that all of today&amp;rsquo;s models use) or simply a way of running the model more efficiently on the underlying hardware.</description></item><item><title>More AI Rollups</title><link>https://tomhipwell.co/blog/more_ai_rollups/</link><pubDate>Fri, 24 Jan 2025 09:56:05 +0000</pubDate><guid>https://tomhipwell.co/blog/more_ai_rollups/</guid><description>Here they come, Rocketable is a YCW25 batch startup following the AI Rollup model (see previous post). The plan here is to purchase profitable SaaS companies throwing off cash and use that cash to bootstrap more purchases, Omaha style. The investment thesis is the application of AI/agents allows full automation of any work done by humans within these small SaaS co&amp;rsquo;s (as it&amp;rsquo;s likely to be generic one assumes).
Feels like a tricky one: the exact businesses willing to sell in this niche are likely to be those that look like killer small businesses on paper today but are very likely to be disrupted on a 5 year horizon, either by general purpose agents (as they are thin automation layers) or by the rapid decrease in the cost of lines of code.</description></item><item><title>AI Rollups</title><link>https://tomhipwell.co/blog/ai_rollups/</link><pubDate>Sat, 18 Jan 2025 09:08:41 +0000</pubDate><guid>https://tomhipwell.co/blog/ai_rollups/</guid><description>There&amp;rsquo;s a few pieces on AI Rollups floating around and I think it&amp;rsquo;s worth getting familiar with the model as it looks like a trend. This presentation has a full tear down of the model.
The tl;dr of that deck is that if you build a vertical SaaS product you can grab more return not by making pure software sales, but instead by buying businesses and then leading the transformation of applying the software to that business; this is known as the growth buyout.</description></item><item><title>2025 AI Engineer Reading List</title><link>https://tomhipwell.co/blog/2025_ai_engineer_reading_list/</link><pubDate>Thu, 02 Jan 2025 19:41:31 +0000</pubDate><guid>https://tomhipwell.co/blog/2025_ai_engineer_reading_list/</guid><description>Good list, I&amp;rsquo;ve read a few of these but lots more to work through. The framing here is useful; though the list of what to read shifts pretty much every week, I think it&amp;rsquo;s a good guide to the areas to sample from. I would add What are embeddings, Yi Model Series and Yann Lecun&amp;rsquo;s talk on Objective Driven AI.</description></item><item><title>Quoting Sean Goedecke</title><link>https://tomhipwell.co/blog/quoting_sean_goedecke/</link><pubDate>Thu, 02 Jan 2025 16:18:19 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_sean_goedecke/</guid><description>I&amp;rsquo;ve long thought consistency is king - I think this applies in codebases of all sizes, not just those in the single digit millions as Sean describes. Here&amp;rsquo;s the summary, though the full article is worth a read:
Large codebases are worth working in because they usually pay your salary
By far the most important thing is consistency
Never start a feature without first researching prior art in the codebase</description></item><item><title>DeepSeek-v3</title><link>https://tomhipwell.co/blog/deepseek_v3/</link><pubDate>Fri, 27 Dec 2024 10:50:43 +0000</pubDate><guid>https://tomhipwell.co/blog/deepseek_v3/</guid><description>DeepSeek-v3 dropped on Christmas Day (!): a gigantic mixture-of-experts model (671b total parameters) which sets new SOTA performance for open source. Why should I care? What does this even mean? Well, the big news here is the training efficiency.
Firstly, the total training cost was ~$5.5m (2.78m GPU hours). Now, this is the GPU cost of the training run only, not the fully loaded cost (i.e. stuff like R&amp;amp;D and staffing costs are not included), but that&amp;rsquo;s a big gain.</description></item><item><title>Hot takes on o3</title><link>https://tomhipwell.co/blog/o3/</link><pubDate>Sun, 22 Dec 2024 11:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/o3/</guid><description>Everywhere seems to be full of hype around o3 since Friday&amp;rsquo;s announcement from OpenAI so I thought I&amp;rsquo;d summarise a few points I&amp;rsquo;ve seen shared in various places but not yet gathered in one place. We&amp;rsquo;re going to zoom in mostly on the ARC-AGI results, as I think that is the most interesting part. Before we do that, let&amp;rsquo;s introduce the ARC challenge.
ARC (Abstraction and Reasoning Corpus) was designed/created by François Chollet, author of Deep Learning with Python and creator of the Keras framework (and ex-Google).</description></item><item><title>WebDev Leaderboard</title><link>https://tomhipwell.co/blog/webdev_leaderboard/</link><pubDate>Tue, 17 Dec 2024 12:18:09 +0000</pubDate><guid>https://tomhipwell.co/blog/webdev_leaderboard/</guid><description>Webdev Arena builds on the Chatbot Arena concept but provides a coding-specific benchmark that offers an extremely fast and cheap way for you to evaluate the vibes of the different models out there.
Given a prompt and two anonymised LLMs, the arena builds two output React/TypeScript/Tailwind apps side by side for you to evaluate - serving them up in an e2b sandbox.
I suspect that as the frontier keeps moving it&amp;rsquo;s worth refining the prompt you use to test models (spend a bit of time making it hard), then each time a model is released on the leaderboard just come in and get a feel for how your own personal benchmark has changed.</description></item><item><title>Quoting Will Whitney</title><link>https://tomhipwell.co/blog/quoting_will_whitney/</link><pubDate>Mon, 16 Dec 2024 11:42:15 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_will_whitney/</guid><description>Some interesting ideas from Will on using generative AI to either manage the set of UI components shown to the user or generating the UI in raw pixels on the fly as we&amp;rsquo;re starting to see in gaming (i.e. Genie 2). I think a pixel based approach would be very complicated to do reliably, but an approach where a model dynamically generated the UI from a set of pre-defined components would be very interesting.</description></item><item><title>Byte Latent Transformer: Patches Scale Better Than Tokens</title><link>https://tomhipwell.co/blog/byte_latent_transformer_patches_scale_better_than_tokens/</link><pubDate>Fri, 13 Dec 2024 21:28:39 +0000</pubDate><guid>https://tomhipwell.co/blog/byte_latent_transformer_patches_scale_better_than_tokens/</guid><description>Interesting paper from Meta that has been generating some buzz:
We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve as the primary units of computation. Patches are segmented dynamically based on the entropy of the next byte, allocating more compute and model capacity where increased data complexity demands it.</description></item><item><title>Sora: An idiot's guide</title><link>https://tomhipwell.co/blog/sora/</link><pubDate>Tue, 10 Dec 2024 11:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/sora/</guid><description>OpenAI | 2025 | Technical Report Sarah Guo, Elad Gill, Aditya Ramesh, Tim Brooks, Bill Peebles | 2025 | Podcast
This post has been sat in my drafts for well over 6 months now, but with yesterday&amp;rsquo;s release of Sora in GA I thought I&amp;rsquo;d have a go at explaining how Sora might be working under the hood, and in particular a breakthrough that OpenAI made (and I assume competitors have now replicated) called Latent Space Time Patches.</description></item><item><title>Fish Eyes</title><link>https://tomhipwell.co/blog/fish_eyes/</link><pubDate>Sat, 07 Dec 2024 09:25:52 +0000</pubDate><guid>https://tomhipwell.co/blog/fish_eyes/</guid><description>I thought this was a brilliant, thought-provoking piece on how to use zoom with text in the LLM era from Amelia Wattenberger. Worth it for the fish animations alone in my book (make sure to keep clicking as you scroll) but there&amp;rsquo;s a tonne of nice ideas here 👀</description></item><item><title>Aurora DSQL</title><link>https://tomhipwell.co/blog/aurora_dsql/</link><pubDate>Wed, 04 Dec 2024 11:11:07 +0000</pubDate><guid>https://tomhipwell.co/blog/aurora_dsql/</guid><description>Insightful piece from Marc Brooker on Aurora DSQL, which was announced at AWS re:invent this week. DSQL stands for &amp;ldquo;distributed sql&amp;rdquo;. The idea is to get ACID semantics at gigantic scale with Postgres compatibility (psql works with Aurora DSQL as a backend):
We built a team to go do something audacious: build a new distributed database system, with SQL and ACID, global active-active, scalability both up and down (with independent scaling of compute, reads, writes, and storage), PostgreSQL compatibility, and a serverless operational model.</description></item><item><title>Baked Search: Building semantic search quickly for toy use cases</title><link>https://tomhipwell.co/blog/baked_search_with_chromadb/</link><pubDate>Mon, 02 Dec 2024 13:19:00 +0000</pubDate><guid>https://tomhipwell.co/blog/baked_search_with_chromadb/</guid><description>Decent quality semantic search has got much easier and cheaper to ship yourself in the last couple of years. I thought I&amp;rsquo;d try and write a super quick guide that gets a search backend up and running as quickly and cheaply as possible.
The guide assumes that you have a toy use case - you&amp;rsquo;re building as a hobbyist. The example I&amp;rsquo;ve chosen is writing search for a blog - specifically a blog built using a static site generator like Hugo, Jekyll, Gatsby etc (like this one!</description></item><item><title>Meritech ServiceTitan S1 Analysis</title><link>https://tomhipwell.co/blog/meritech_servicetitan_s1_analysis/</link><pubDate>Sat, 30 Nov 2024 14:36:15 +0000</pubDate><guid>https://tomhipwell.co/blog/meritech_servicetitan_s1_analysis/</guid><description>This S1 analysis from Meritech went viral due to the (compounding!) IPO ratchet that ServiceTitan are subject to after the Series H funding they took 18 months ago. About halfway down there&amp;rsquo;s some handy benchmarks for median/top decile pre-IPO performance in vertical SaaS, I&amp;rsquo;ve pocketed them for reference (maybe they&amp;rsquo;ll come in handy one day!), so I thought I&amp;rsquo;d reproduce them here:
Performance by EV / ARR percentile (Top Decile / Median / ServiceTitan). Financial metrics: Implied ARR ($M): $2,707 / $923 / $772; % YoY Implied ARR Growth: 28% / 17% / 24%; Net Dollar Retention: 118% / 110% / 110%; Implied ARR / FTE ($K): $589 / $356 / $269; LTM Implied Payback Period (Months): 20.</description></item><item><title>Function Calling with Llama-3.2-3B-Instruct</title><link>https://tomhipwell.co/blog/function_calling_with_llama_3_2_3b_instruct/</link><pubDate>Sat, 30 Nov 2024 14:05:21 +0000</pubDate><guid>https://tomhipwell.co/blog/function_calling_with_llama_3_2_3b_instruct/</guid><description>End to end tutorial of function calling with Llama-3.2-3B-Instruct, building gradually from string templating, to using Jinja, to implementing web search with Brave:
Rule #1: Don’t be afraid to launch a product without machine learning.
Machine learning is cool, but it requires data. Theoretically, you can take data from a different problem and then tweak the model for a new product, but this will likely underperform basic heuristics. If you think that machine learning will give you a 100% boost, then a heuristic will get you 50% of the way there.</description></item><item><title>uv Cheat Sheet</title><link>https://tomhipwell.co/blog/uv_cheat_sheet/</link><pubDate>Tue, 26 Nov 2024 22:56:10 +0000</pubDate><guid>https://tomhipwell.co/blog/uv_cheat_sheet/</guid><description>This [cheat sheet] by Brandon Rohrer (https://www.brandonrohrer.com/uv_cheatsheet) works as a handy 30 second introduction to the python package/project management tool if you&amp;rsquo;ve not met it already. I&amp;rsquo;ve been teetering on the brink; I&amp;rsquo;m boring and have stuck with pip and pip-tools for a long time but it is so incredibly fast that I am tempted to move over.</description></item><item><title>Quoting AWS Lambda PR/FAQ</title><link>https://tomhipwell.co/blog/quoting_aws_lambda_pr_faq/</link><pubDate>Thu, 21 Nov 2024 11:51:59 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_aws_lambda_pr_faq/</guid><description>Brilliant throughout, with lots of small, golden nuggets. PR/FAQs like this are, for me, a bit wordy but you can see how effectively the technique is used here to explain the impact of not just a ground breaking new technical approach but also a major shift in AWS&amp;rsquo;s compute billing model to a wide audience:
When we launched Lambda, security was not negotiable – and we knew that there would be trade-offs.</description></item><item><title>OpenAI Email Archives (from Musk v. Altman)</title><link>https://tomhipwell.co/blog/openai_email_archives_from_musk_v_altman/</link><pubDate>Sat, 16 Nov 2024 14:39:30 +0000</pubDate><guid>https://tomhipwell.co/blog/openai_email_archives_from_musk_v_altman/</guid><description>There&amp;rsquo;s been a tranche of emails released as part of the Musk vs Altman stuff around OpenAI and it makes for some interesting reading.
One of the big things that jumps out is how much focus there is on crafting the narrative and mission for OpenAI.
They&amp;rsquo;re obsessed with getting the best talent (cheaply it seems), using the mission as the motivator:
Sam Altman to Elon Musk - Jun 24, 2015</description></item><item><title>The Effects of Generative AI on High Skilled Work: Evidence from Three Field Experiments with Software Developers</title><link>https://tomhipwell.co/blog/the_effects_of_generative_ai_on_high_skilled_work_evidence_from_three_field_experiments_with_software_developers/</link><pubDate>Tue, 12 Nov 2024 11:31:10 +0000</pubDate><guid>https://tomhipwell.co/blog/the_effects_of_generative_ai_on_high_skilled_work_evidence_from_three_field_experiments_with_software_developers/</guid><description>TL;DR: A study of ~5,000 engineers across Microsoft, Accenture, and a Fortune 100 company finds Github Copilot boosts weekly PRs by 26.08% (SE: 10.2%) - but the effect varies widely, with a 95% confidence interval from 5.88% to 46.28%. Adoption patterns show junior and newly tenured engineers are more likely to use Copilot (up to 9.5% higher). 30-40% of engineers didn’t use it at all.
This is a paper I saw posted about a bit during the summer that looks at the productivity impact of Github Copilot for organisations.</description></item><item><title>Quoting Google Big Sleep team</title><link>https://tomhipwell.co/blog/quoting_google_big_sleep_team/</link><pubDate>Fri, 08 Nov 2024 20:04:56 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_google_big_sleep_team/</guid><description>Pattern matching with LLMs used to find security vulns in the wild:
A key motivating factor for Naptime and now for Big Sleep has been the continued in-the-wild discovery of exploits for variants of previously found and patched vulnerabilities. As this trend continues, it&amp;rsquo;s clear that fuzzing is not succeeding at catching such variants, and that for attackers, manual variant analysis is a cost-effective approach.
We also feel that this variant-analysis task is a better fit for current LLMs than the more general open-ended vulnerability research problem.</description></item><item><title>Quoting Graham Paterson</title><link>https://tomhipwell.co/blog/quoting_graham_paterson/</link><pubDate>Tue, 05 Nov 2024 20:40:14 +0000</pubDate><guid>https://tomhipwell.co/blog/quoting_graham_paterson/</guid><description>Love a nice data driven product feedback loop; the Jitty folk have found a nice pattern with natural language search:
Over the weekend we quietly released a highly requested feature on Jitty: search by travel time 🚌🚶🚗🚴‍♀️🚂
We&amp;rsquo;ve partnered with the good people of the aptly named TravelTime to let homebuyers search by time rather than just distance.
Since we launched natural language search, we can see what people search for.</description></item><item><title>AI-Assisted Assessment of Coding Practices in Modern Code Review</title><link>https://tomhipwell.co/blog/ai_assisted_assessment_of_coding_practices_in_modern_code_review/</link><pubDate>Fri, 01 Nov 2024 11:37:11 +0000</pubDate><guid>https://tomhipwell.co/blog/ai_assisted_assessment_of_coding_practices_in_modern_code_review/</guid><description>Nice paper on AI assisted code review at Google. Three call outs that I thought were interesting (as I imagine that we&amp;rsquo;re about to be hit by a tidal wave of commercial applications of this idea):
(1) One of the issues is that the required training dataset varies by best practice - the currency of knowledge really matters. So for example the underlying model was trained on data prior to &amp;lsquo;22, but the canonical source of python type definitions has shifted about a fair bit from Python 3.</description></item><item><title>AI, Ad Dollars</title><link>https://tomhipwell.co/blog/ai_and_ad_dollars/</link><pubDate>Thu, 26 Sep 2024 12:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/ai_and_ad_dollars/</guid><description>I liked Ethan Mollick&amp;rsquo;s post on ad dollars earlier this week, here it is if you missed it:
No one has figured out how you integrate advertising with LLM replies. If it is contextual ads around the LLM, then a good LLM answer should provide more guidance to the product you want than ads, making the ads useless. If ads are integrated into the prompt, with the instructions that the advertiser be recommended, that will lead to inaccurate, bad answers.</description></item><item><title>AGI Predictions</title><link>https://tomhipwell.co/blog/on_agi_predictions/</link><pubDate>Wed, 12 Jun 2024 16:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/on_agi_predictions/</guid><description>I really enjoyed the nonint post on timelines to AGI. Obviously James Betker is better placed than me to make an informed prediction, and he has inside information (he works for OpenAI) but there&amp;rsquo;s a couple of things that jump out at me if I read this prediction critically.
Firstly, given that transformers are great general approximators of behaviour, it&amp;rsquo;s very difficult to falsify any predictions about AGI without having a very specific and testable definition of what AGI is that everyone agrees on.</description></item><item><title>dspy unpacked: continuous prompt optimisation</title><link>https://tomhipwell.co/blog/dspy/</link><pubDate>Fri, 26 Apr 2024 15:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/dspy/</guid><description>Omar Khattab, Chris Potts, Matei Zaharia | 2023 | Paper | Github | Docs
A lot of work with LLMs today involves working through a loop where you break a problem into steps, write a prompt for each step, then put the whole thing together by adjusting each prompt to feed into the next one.
dspy simplifies this process. It gives you a framework to structure your pipeline - forcing you to architect the application so your program flow is split from the variable stuff - the prompts and model weights that get fed to the LLM.</description></item><item><title>How many customer interviews are enough?</title><link>https://tomhipwell.co/blog/how_many_customer_interviews/</link><pubDate>Thu, 18 Apr 2024 13:30:00 +0000</pubDate><guid>https://tomhipwell.co/blog/how_many_customer_interviews/</guid><description>Counts of customer interviews seem to have become a bit of a vanity metric of late. A shorthand for product or decision quality, as if one automatically implies the other.
I appreciate your sacrifice at the temple of customer research, but I worry that you may have wasted your time.
Working out the right number of interviews, wireframe tests or customers in the alpha phase of your project is quite similar to an optimal stopping problem.</description></item><item><title>llm.c: The genius of Andrej Karpathy</title><link>https://tomhipwell.co/blog/llm_c/</link><pubDate>Thu, 11 Apr 2024 20:19:00 +0000</pubDate><guid>https://tomhipwell.co/blog/llm_c/</guid><description>What&amp;rsquo;s awesome about Andrej Karpathy&amp;rsquo;s llm.c isn&amp;rsquo;t just that it&amp;rsquo;s a bare-metal, from-scratch implementation of GPT-2 (safety wink definitely required!).
If you take a step back, you&amp;rsquo;ll see he&amp;rsquo;s also educating us on how one of the very best in the world hones their craft. He&amp;rsquo;s stripped away the intermediate layer of libraries - there&amp;rsquo;s no PyTorch here. Instead, we&amp;rsquo;re taken back to the basics: an attempt to implement a simple C and CUDA version of GPT-2 in ~1000 lines with no dependencies or frameworks involved.</description></item><item><title>March '24 Roundup</title><link>https://tomhipwell.co/blog/march_24_roundup/</link><pubDate>Sun, 31 Mar 2024 11:20:00 +0000</pubDate><guid>https://tomhipwell.co/blog/march_24_roundup/</guid><description>March was the month we got Grok, OpenAI confirmed their strategy and we no longer needed to run on vibes alone as gpt-4 was displaced at the top of the leaderboards. An experiment was also kicked off to learn about the pricing power of the major LLM providers.
One of the things I most enjoyed this month was the explosion of interest in LLM agents with the launch of Devin, the AI software engineer.</description></item><item><title>Hot takes on Devin, the AI software engineer</title><link>https://tomhipwell.co/blog/devin/</link><pubDate>Sun, 17 Mar 2024 15:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/devin/</guid><description>I thought Devin from Cognition looked super cool this week, the UX feels like a glimpse of a new era.
I wonder how deep the moat is though? 🤔
From staring a little bit too closely at the screenshots and videos I&amp;rsquo;ve seen so far, a hot take would be that it feels like most of the performance lift in the SWE benchmarks could come from a switch in prompting technique, i.</description></item><item><title>Grandmaster-Level Chess Without Search</title><link>https://tomhipwell.co/blog/grandmaster_chess_without_search/</link><pubDate>Sat, 16 Mar 2024 08:55:30 +0000</pubDate><guid>https://tomhipwell.co/blog/grandmaster_chess_without_search/</guid><description>Anian Ruoss, Grégoire Delétang, Sourabh Medapati, Jordi Grau-Moya, Li Kevin Wenliang, Elliot Catt, John Reid and Tim Genewein | Grandmaster Level Chess without Search | 2024 | Paper
Walter Isaacson | Elon Musk | 2023 | Book
Towards the end of Walter Isaacson&amp;rsquo;s biography of Elon Musk, there&amp;rsquo;s a description of a breakthrough with Tesla Autopilot:
For years, Tesla’s Autopilot system relied on a rules-based approach. It took visual data from a car’s cameras, identified such things as lane markings, pedestrians, vehicles and traffic signals, and applied a set of rules to what was in range of the eight cameras.</description></item><item><title>February '24 Roundup</title><link>https://tomhipwell.co/blog/february_24_roundup/</link><pubDate>Mon, 26 Feb 2024 11:20:00 +0000</pubDate><guid>https://tomhipwell.co/blog/february_24_roundup/</guid><description>February feels like it&amp;rsquo;s gone in a blur. Hofy had a brilliant company retreat in Peniche, Portugal. Sora looks insane. Google returned to open source AI with the Gemma series while Mistral released a hosted, closed-source model. Here&amp;rsquo;s a few other things that caught the eye:
Self-Discover, Google DeepMind Can we improve LLM reasoning by adjusting the way in which we prompt? Google DeepMind demonstrate an up to 32% uplift in performance that transfers across LLMs (GPT-4, GPT-3.</description></item><item><title>LLMs as classifiers</title><link>https://tomhipwell.co/blog/llms_as_classifiers/</link><pubDate>Fri, 09 Feb 2024 08:55:30 +0000</pubDate><guid>https://tomhipwell.co/blog/llms_as_classifiers/</guid><description>Lefteris Loukas, Ilias Stogiannidis, Odysseas Diamantopoulos, Prodromos Malakasiotis, Stavros Vassos | 2023 | Paper
When I&amp;rsquo;ve heard folks talking about AI strategy recently a common trope has been that things are moving so fast that it is better to hold off product investments until the pace of change slows and the stack starts to stabilise. Instead we should be focussing on the low hanging fruit from the productivity lifts of using chat assistants or GPTs.</description></item><item><title>January '24 Roundup</title><link>https://tomhipwell.co/blog/january_roundup/</link><pubDate>Tue, 30 Jan 2024 14:15:00 +0000</pubDate><guid>https://tomhipwell.co/blog/january_roundup/</guid><description>End of month one already so its time for a round up. Here&amp;rsquo;s a few different bits I found interesting in January:
Supervised fine-tuning (SFT), Niels Rogge All the steps in going from a base model to a useful assistant using supervised fine tuning. A slightly deeper run through of fine tuning Mistral-7B end to end, with the details coloured in - from hardware sizing to a run through the different PEFT approaches (PEFT is parameter efficient fine tuning, e.</description></item><item><title>Things I'd like to learn in 2024</title><link>https://tomhipwell.co/blog/2024_in_learning/</link><pubDate>Wed, 17 Jan 2024 13:19:00 +0000</pubDate><guid>https://tomhipwell.co/blog/2024_in_learning/</guid><description>I guess like everyone in 2023, I&amp;rsquo;ve thought a lot about LLMs, LMMs and all the rest of it. As an interested bystander and casual observer, I thought I&amp;rsquo;d stake out three things that I&amp;rsquo;m curious to learn more about during the course of 2024 as I try and get that bit closer to the edge. If you have similar thoughts, can correct the gaps in my reasoning or are further along the curve and can signpost me to some good reading on these themes, I&amp;rsquo;d love it.</description></item><item><title>Bug and Incident Severity Template</title><link>https://tomhipwell.co/blog/bug_and_incident_severity/</link><pubDate>Sat, 30 Dec 2023 10:45:00 +0000</pubDate><guid>https://tomhipwell.co/blog/bug_and_incident_severity/</guid><description>PreviewMarkdown Bug and Incident Severity Guide Principles Fast triage: We grade and prioritise the bug straight away. Bias to action: We have a strong bias towards action, we either resolve the issue or close it. Ask forgiveness, not permission: Everyone can make a decision, follow the guidelines, then tell others about your decision. 
Accountability: We’re all collectively accountable for keeping our bugs in a healthy state, no one person or team is responsible.</description></item><item><title>Friction Log Template</title><link>https://tomhipwell.co/blog/friction_log/</link><pubDate>Sat, 30 Dec 2023 10:45:00 +0000</pubDate><guid>https://tomhipwell.co/blog/friction_log/</guid><description>PreviewMarkdown Friction Logging: TEMPLATE Author:
Date:
Size: S | M | L
What is a Friction Log? A friction log is a type of UX experiment where the subject journals their feelings, thoughts, struggles, joys, and any other type of emotion. The point is to surface anything that gives the user discomfort or joy so the product or feature can improve.
Context Describe the persona of the person reviewing the feature, as well as what they were trying to accomplish.</description></item><item><title>Incident Report Template</title><link>https://tomhipwell.co/blog/incident_report/</link><pubDate>Sat, 30 Dec 2023 10:45:00 +0000</pubDate><guid>https://tomhipwell.co/blog/incident_report/</guid><description>PreviewMarkdown Incident Report: Lead Empty Analysis Date Empty Severity Empty Treatment Ticket Empty ⚠️ **Guidelines for Incident Reports** Fast and light: Keep the report concise, ≤ 500 words, less than an hour of your time. Be scrappy, delete sections in this template you don’t need. Aim to write up within 24-48 hours of the incident being resolved Just the facts: avoid unecessary jargon, write just enough to provide a clear understanding of what happened and when.</description></item><item><title>2023 In Review</title><link>https://tomhipwell.co/blog/2023_in_review/</link><pubDate>Thu, 21 Dec 2023 14:20:00 +0000</pubDate><guid>https://tomhipwell.co/blog/2023_in_review/</guid><description>2023 was an incredible year in our industry, so I thought I&amp;rsquo;d look back and share the things I&amp;rsquo;ve loved reading, watching, learning and doing this year.
Blogs The GitHub one on Copilot, a slow and high-level reveal around how Copilot is put together. Also, Jaccard similarity ftw! LLM Patterns, Eugene Yan&amp;rsquo;s summary post back in the Summer described a bunch of reference architectures for an emerging field for the first time.</description></item><item><title>Modern code review: a case study at Google</title><link>https://tomhipwell.co/blog/modern_code_review/</link><pubDate>Thu, 07 Dec 2023 10:59:10 +0000</pubDate><guid>https://tomhipwell.co/blog/modern_code_review/</guid><description>Caitlin Sadowski, Emma Söderberg, Luke Church, Michal Sipko, Alberto Bacchelli | 2018 | Paper
Benchmarks for Code Review It&amp;rsquo;s handy as we talk about code review to use some benchmarks that anchor our expectations for review performance in data. The best that I know of are in a 2018 paper from Google: &amp;ldquo;Modern Code Review: A Case Study at Google&amp;rdquo;. I find these helpful when breaking down qualitative feedback about reviews as, if you can get the data, you can start to get a feel for where improvements in your review process can come from.</description></item><item><title>What Great Looks Like</title><link>https://tomhipwell.co/blog/what_great_looks_like/</link><pubDate>Fri, 08 Sep 2023 14:20:00 +0000</pubDate><guid>https://tomhipwell.co/blog/what_great_looks_like/</guid><description>PreviewMarkdown What Great Looks Like: NAME - TITLE ⚠️ This is a work in progress, the steps to get this live are: (1) write a first draft from this template, (2) we review it together in a 1:1, (3) we re-draft based on feedback, (4) we share it with the organisation as a whole so everyone has clarity on your role and responsibilities. Why are we doing this? We’re trying to remove ambiguity, if we do the hard thing and get specific about what we expect from a role it can help us identify strengths (so we can work on amplifying them) and areas for improvement (so you can improve your skills).</description></item><item><title>The SPACE of developer productivity: There's more to it than you think</title><link>https://tomhipwell.co/blog/the_space_of_developer_productivity/</link><pubDate>Mon, 03 Jul 2023 19:01:30 +0000</pubDate><guid>https://tomhipwell.co/blog/the_space_of_developer_productivity/</guid><description>Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, Jenna Butler | 2021 | Paper
Summary I read the devex paper by a few of the same team recently so I went back and read the original SPACE paper as well. This one is a bit more famous; it was widely syndicated on publication in 2021 and the ideas behind SPACE have seeped into the general engineering consciousness over the past couple of years.</description></item><item><title>Developer experience: What actually drives productivity?</title><link>https://tomhipwell.co/blog/dev_ex_what_actually_drives_productivity/</link><pubDate>Thu, 29 Jun 2023 18:01:30 +0000</pubDate><guid>https://tomhipwell.co/blog/dev_ex_what_actually_drives_productivity/</guid><description>Abi Noda, Margaret-Anne Storey, Nicole Forsgren, Michaela Greiler | 2023 | Paper
Summary Developer Experience or DX focuses on the lived experience of developers and the points of friction they encounter in their everyday work. In addition to improving productivity, developer experience can drive business performance through increased efficiency, product quality and employee retention.
Quite a few recent approaches in this space have focussed solely on engineering metrics to measure whether an engineering organisation can be classed as &amp;ldquo;elite&amp;rdquo;.</description></item><item><title>Product Manager Hiring at a European Deep Tech Startup</title><link>https://tomhipwell.co/blog/pm_experience/</link><pubDate>Tue, 16 May 2023 14:20:00 +0000</pubDate><guid>https://tomhipwell.co/blog/pm_experience/</guid><description>Hiring Product Managers for Deep Tech I was catching up with a peer earlier this week and towards the end of our call, they threw out an interesting question - if you were hiring a product manager (PM) for our team right now, what profile would you hire?
For context, the startup they worked for was a European deep tech startup. Their current team size is around 15, with 5-7 members of that team in engineering.</description></item><item><title>Quick Technical Decisions: Trade-off, Pay-off, Decision, Communication</title><link>https://tomhipwell.co/blog/quick_technical_decisions/</link><pubDate>Tue, 09 May 2023 12:55:56 +0000</pubDate><guid>https://tomhipwell.co/blog/quick_technical_decisions/</guid><description>A skill that a lot of engineers develop as they move towards a Senior level is the ability to make quick technical decisions verbally or in short form (say, a Slack post). Mastering this skill allows for triage: where do I need to pause and spend time working through or spiking the idea with peers, where can I just make the decision and keep moving under my own steam, and what should I set to one side for another time?</description></item><item><title>High Output Management</title><link>https://tomhipwell.co/blog/hom/</link><pubDate>Tue, 02 May 2023 00:00:00 +0000</pubDate><guid>https://tomhipwell.co/blog/hom/</guid><description>Andy Grove&amp;rsquo;s masterpiece on management and leadership was a book I reached for as I transitioned from a role as a senior IC to leading an engineering organisation for the first time. To improve my practice, I created this deck to break down the core lessons from the book and keep them at the forefront of my mind as I went about my day-to-day activities. The book provides guidance on effective communication, setting clear goals, and offers techniques for measuring and improving productivity.</description></item><item><title>About</title><link>https://tomhipwell.co/about/</link><pubDate>Fri, 28 Apr 2023 00:00:00 +0000</pubDate><guid>https://tomhipwell.co/about/</guid><description>I lead technology at Hofy - a Series B startup revolutionising how we equip the world&amp;rsquo;s talent. 
I&amp;rsquo;m a technology leader with 15 years of engineering and product development experience at hypergrowth startups, scaleups and Fortune 100 enterprises.
I live in South Devon, England with my wife Hannah, our two daughters Margot and Alma and our dog, Woody.</description></item><item><title>Copyleft Licenses</title><link>https://tomhipwell.co/blog/license_checking/</link><pubDate>Sun, 16 Apr 2023 11:10:56 +0000</pubDate><guid>https://tomhipwell.co/blog/license_checking/</guid><description>License Checking I&amp;rsquo;ve been through a couple of due diligence processes now (sales, funding rounds), and one question that always gets asked is whether we have any dependencies with a copyleft (viral) license in our codebases. This includes the full dependency tree for every dependency so it can be a bit of a scramble working this out (and fixing it) across many repos in a short space of time. Having been bitten by this a couple of times I&amp;rsquo;ve now learnt that it&amp;rsquo;s a good practice to get a basic license checker wired into your CI for each repo nice and early - stopping it from becoming a problem in the first place as you can fail the release if the license is viral.</description></item><item><title>1:1 Template</title><link>https://tomhipwell.co/blog/1to1/</link><pubDate>Mon, 27 Mar 2023 13:25:50 +0000</pubDate><guid>https://tomhipwell.co/blog/1to1/</guid><description>PreviewMarkdown XX &amp;lt;&amp;gt; TH Important Links Kick Off 1:1 Brag Document ⏫ Growth Tracker Growth We should be catching up on your growth every four weeks, our last growth chat was:
Never! First one planned for xx xx Minutes This is your agenda, so please maintain it on a week to week basis!
DD MM 202X Pulse check: what&amp;rsquo;s on your mind? Kick Off 1:1 Brag Document What are 1:1s for?</description></item><item><title>Technical Design Document</title><link>https://tomhipwell.co/blog/tdd/</link><pubDate>Fri, 10 Mar 2023 10:45:00 +0000</pubDate><guid>https://tomhipwell.co/blog/tdd/</guid><description>PreviewMarkdown TDD Template How to use this document (remove once read) Use it as a checklist to make sure you cover all bases, remove anything you don’t need.
Work fast and light. A good TDD is &amp;lt;1000 words and it includes one diagram (that’s two pages, not including the 525 words for the template).
Write it in the open (no private documents!). Once it’s in Ready state, share it with everyone.</description></item><item><title>Decision Document</title><link>https://tomhipwell.co/blog/decision/</link><pubDate>Wed, 08 Mar 2023 13:37:52 +0000</pubDate><guid>https://tomhipwell.co/blog/decision/</guid><description>PreviewMarkdown Decision Template Maximum 1500 words. Provide links to supporting documentation inline.
Fill in the RAPID header block. Write the Problem statement [Optional] draft the Context block Define the Principles we must follow in making this decision Define the Dimensions we will measure the options against Develop the Options Ensure the Context block has all necessary background information Write the Recommendation Delete these instructions Date Recommend Agree Perform Input Decide Status DRAFT Recommendation / Decision Start with the point.</description></item><item><title>VS Code</title><link>https://tomhipwell.co/blog/vscode/</link><pubDate>Fri, 10 Feb 2023 15:30:00 +0000</pubDate><guid>https://tomhipwell.co/blog/vscode/</guid><description>I&amp;rsquo;ve done a lot of technical hiring and a part of that process has always been a hands on paired programming exercise. For me, one of the traits of the better candidates is deep familiarity with their IDE or editor - they&amp;rsquo;ve spent time learning about their tool of choice and had a strong taste for the how and why of it&amp;rsquo;s configuration. Typically the candidates who checked, adjusted or discussed the configuration would finish the exercise much more quickly and comfortably, so it was normally a leading indicator early in the interview that things were going to go well.</description></item><item><title>Company Strategy</title><link>https://tomhipwell.co/blog/strategy/</link><pubDate>Mon, 05 Dec 2022 09:15:00 +0000</pubDate><guid>https://tomhipwell.co/blog/strategy/</guid><description>PreviewMarkdown Strategy Template What is strategy? Strategy is not a roadmap, it is a set of powerful choices that combine to position the company to win over the next n (e.g. 3-5) years. There are 5 key choices in Strategy.
What is our winning aspiration? Where will we play? How will we win where we have chosen to play? What capabilities must be in place to win? What management systems are required to ensure the capabilities are in place?</description></item><item><title>Superforecasting</title><link>https://tomhipwell.co/blog/superforecasting/</link><pubDate>Thu, 01 Dec 2022 09:45:00 +0000</pubDate><guid>https://tomhipwell.co/blog/superforecasting/</guid><description>In work (and life) we&amp;rsquo;re making (and receiving) predictions all the time - from when that all important project is going to ship to how a core business metric is going to move over the next few weeks via how long it might take to hire that critical role. Working to refine and understand this skill felt important, and Philip Tetlock&amp;rsquo;s book sets out a few core behaviours and techniques used by the very best.</description></item><item><title>Multipliers</title><link>https://tomhipwell.co/blog/multipliers/</link><pubDate>Fri, 18 Nov 2022 09:00:00 +0000</pubDate><guid>https://tomhipwell.co/blog/multipliers/</guid><description>Multipliers are leaders who amplify the intelligence and capabilities of their teams. This book is another resource I&amp;rsquo;ve reached for as I&amp;rsquo;ve transitioned into being an engineering leader. It explores techniques for unlocking the full potential of team members, fostering a culture of collaboration, and empowering individuals to contribute their best work. This can impact individual growth, innovation and problem solving ability. 
The deck summarises the key lessons from the book, so the techniques (e.</description></item><item><title>Flow</title><link>https://tomhipwell.co/blog/flow/</link><pubDate>Sat, 05 Nov 2022 15:30:00 +0000</pubDate><guid>https://tomhipwell.co/blog/flow/</guid><description>Mihaly Csikszentmihalyi&amp;rsquo;s book &amp;ldquo;Flow&amp;rdquo; explores the concept of flow, a state of optimal human experience where individuals are fully immersed and highly focused on an activity. Understanding flow, the structure of flow like experience and how to plan for and cultivate flow in your life and work can increase how much you enjoy activities like programming that are very conducive to flow experiences. We can use our understanding of flow to achieve increased focus, productivity and creativity in our work.</description></item><item><title>Regex</title><link>https://tomhipwell.co/blog/regex/</link><pubDate>Thu, 20 Oct 2022 14:45:00 +0000</pubDate><guid>https://tomhipwell.co/blog/regex/</guid><description>Early in my career watching other engineers pattern match across a codebase or some logs using regex felt like some kind of inaccessible, dark magic - they could find (and replace) what they were looking ten times faster than I could. If you&amp;rsquo;ve not spent time learning the ins and outs of regular expressions and committing them to memory then this is an easy way of boosting your productivity - they pop up everywhere you need a string operation and in every context.</description></item><item><title>Bash</title><link>https://tomhipwell.co/blog/bash/</link><pubDate>Sat, 15 Oct 2022 18:30:00 +0000</pubDate><guid>https://tomhipwell.co/blog/bash/</guid><description>Getting some &amp;rsquo;nix super powers is part of the secret sauce that most high performing engineers have, and this deck can help bootstrap that process, fill in gaps or retain muscle memory if you&amp;rsquo;re spending less time in the terminal than you used to. 
The majority of the cards are built based on notes from the book The Linux Command Line which is available for free under a creative commons license and well worth a read.</description></item><item><title>Oh-My-Zsh</title><link>https://tomhipwell.co/blog/ohmyzsh/</link><pubDate>Sat, 15 Oct 2022 18:30:00 +0000</pubDate><guid>https://tomhipwell.co/blog/ohmyzsh/</guid><description>Oh-my-zsh is another little bit of engineering secret sauce, though if you&amp;rsquo;re fresh to using the framework I&amp;rsquo;d probably carefully review alternatives and find something that closely fits your workflow before diving in. The deck summarises a bunch of handy aliases for different combinations of git, yarn, kubectl and python commands. Mileage might vary (I&amp;rsquo;d suggest building your own deck instead of using this one) but I include it here as an example as I&amp;rsquo;ve found that learning the aliases nudged me towards better practice as it changed my defaults and this in turn improved the quality of my work and my productivity.</description></item><item><title>Getting Things Done</title><link>https://tomhipwell.co/blog/gtd/</link><pubDate>Fri, 14 Oct 2022 10:54:39 +0000</pubDate><guid>https://tomhipwell.co/blog/gtd/</guid><description>Getting things done is so ubiquitous that it probably needs no introduction, but as my career developed and my home life grew more complex I needed to improve how I managed my time and that meant leaving behind some bad habits. 
This book and this deck helped with that transformation and while I&amp;rsquo;m now quite a few years down the line with that process I still keep working through these cards to keep the core principles of GTD front of mind, so I can use it in my own life and to coach others.</description></item><item><title>Security Review</title><link>https://tomhipwell.co/blog/security_review/</link><pubDate>Thu, 13 Oct 2022 16:59:37 +0000</pubDate><guid>https://tomhipwell.co/blog/security_review/</guid><description>PreviewMarkdown Security Review Checklist (TEMPLATE) Completed By: @yourself
Date: **@Today**
You should be able to find out the majority of the answers to these questions by browsing the vendor’s website. If you’re struggling then it may be worth scheduling a call with a sales rep and seeing if you can answer the questions. If you’re still having issues or this is a major account and you think it needs some extra scrutiny, then reach out to &amp;lt;who&amp;rsquo;s in charge here?</description></item><item><title>Product Requirements Document</title><link>https://tomhipwell.co/blog/prd/</link><pubDate>Tue, 27 Sep 2022 14:20:00 +0000</pubDate><guid>https://tomhipwell.co/blog/prd/</guid><description>PreviewMarkdown PRD: [PRD TEMPLATE] How to use this document Use it as a checklist to make sure you cover all bases, remove anything you don’t need.
Work fast and light. A good PRD is &amp;lt;1000 words and it includes diagrams (that’s two pages, not including the 525 words for the template).
Write it in the open (no private documents!). Once it’s in Ready state, share it with everyone via Slack.</description></item><item><title>Git</title><link>https://tomhipwell.co/blog/git/</link><pubDate>Sun, 28 Aug 2022 12:00:00 +0000</pubDate><guid>https://tomhipwell.co/blog/git/</guid><description>Helps with memorising git commands and flags to get you going a bit faster in the terminal. Knowing some of the ins and outs of git is good for getting into a flow state quicker when working (what I mean here is that grepping back through your bash history or googling for that command or flag you need doesn&amp;rsquo;t interrupt you, you can remain focussed on writing the code). Covers some deeper bits of git as well (i.</description></item><item><title>Kick Off 1:1</title><link>https://tomhipwell.co/blog/kickoff/</link><pubDate>Fri, 26 Aug 2022 13:02:28 +0000</pubDate><guid>https://tomhipwell.co/blog/kickoff/</guid><description>PreviewMarkdown Kick Off 1:1 This document is intended to be a first step together where we contract to understand what you need to get the most out of our 1:1 conversations. If you want to read more, Lara Hogan (who came up with the idea) has a good blog post on why Kick Off 1:1s are important.
Contract The contract helps us to make sure we’re all on the same page.</description></item><item><title>Brag Document</title><link>https://tomhipwell.co/blog/brag/</link><pubDate>Sat, 20 Aug 2022 13:13:38 +0000</pubDate><guid>https://tomhipwell.co/blog/brag/</guid><description>PreviewMarkdown Brag Document For a bit of background, Julia Evans has a nice write up on brag documents here. The idea is that this is a quick space for you to jot down your achievements on a week to week basis. We can work through the list together to find themes, work out the big picture of what you’re working on and celebrate your accomplishments. When we come to do a performance review, we don’t need to rely on our probably fuzzy memories - we’ll have a written record of your achievements.</description></item><item><title>Peak</title><link>https://tomhipwell.co/blog/peak/</link><pubDate>Sat, 18 Jun 2022 10:55:10 +0000</pubDate><guid>https://tomhipwell.co/blog/peak/</guid><description>Anders Ericsson&amp;rsquo;s masterpiece on how to acquire expertise through deliberate practice changed the way I approached learning - for example, by making my approach far more systematic and methodical. This deck summarises the core lessons of the book.</description></item><item><title>How to take smart notes</title><link>https://tomhipwell.co/blog/how_to_take_smart_notes/</link><pubDate>Thu, 26 May 2022 10:55:27 +0000</pubDate><guid>https://tomhipwell.co/blog/how_to_take_smart_notes/</guid><description>Sönke Ahrens | 2017 | Amazon | Goodreads
Why take notes? And why be structured and disciplined about how it is done? Academic success is not correlated to IQ north of 120, so what&amp;rsquo;s the secret sauce? There is no measurable correlation between a high IQ and academic success – at least not north of 120. This tallies with studies of Nobel prize winners, where the IQ range is given as >=120.</description></item></channel></rss>