Eli Skeggs

is a Software Engineer

Read this first

S3 is a hobbyist database

AWS and other Cloud Service Providers (CSPs) provide many sophisticated services for managing, storing, and querying data. These services excel at meeting the diverse needs of large enterprises. However, they end up far exceeding the capabilities and budget of the hobbyist service engineer. For small-scale projects, web-scale data storage is largely unnecessary, and massive fault tolerance is exorbitant.

The hobbyist looks instead for low-cost, serverless data storage and retrieval solutions, while also aiming to minimize management overhead. Minimizing management overhead is crucial for project success: a hobbyist does not want to become a full-time Site Reliability Engineer (SRE) on a project with no revenue and two users! While projects can sometimes afford a budget of $5/mo (enough for a particularly inexpensive AWS EC2 instance), the time cost to manage the filesystem...

Continue reading →


Honesty and Artificial Intelligence

Suppose we create an artificial intelligence capable of producing a logical, functional proof for some claim about the physical world. Such an AI would need to have first modeled enough language to understand our messy world in recognizable terms. It would further need to know enough to propose statements about our world, and need to grasp enough logic to derive new statements from given statements and rules.

Assume that the AI mechanically produces natural language proofs, and that its skills and reasoning do not translate to other domains. That is, it has not sufficiently generalized its understanding for us to consider it sentient. Machine learning tools could potentially produce such an “AI,” and given the limitations placed on it, we could decide that such an AI lacks higher intelligence, and is therefore a tool of its user. The user, after all, must provide context for any proof...

Continue reading →


Time Representations and Decimal Years

Titles are cool.

I wanted to represent a timespan in terms of decimal years, for no apparent reason. This isn’t the most interesting thing I have to talk about, to be fair.

Anyway, I noticed that there are a variety of ways to calculate this timespan, specifically the non-integer portion. As always, there’s a naïve way to compute the entire number: calculate the difference in milliseconds (or some similarly small unit of time), and divide by the number of milliseconds in a year. How many milliseconds are in a year? That’s a silly question, because years don’t have a constant length. We could use the average, but the result would usually be slightly off.
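
As a rough sketch of that naïve approach, here it is in Python. The function name is made up for illustration, and the "average" year is assumed to be the Gregorian mean of 365.2425 days, which is only one of several reasonable choices.

```python
from datetime import datetime, timezone

# Assumption: an "average" year is the Gregorian mean of 365.2425 days.
MS_PER_AVERAGE_YEAR = 365.2425 * 24 * 60 * 60 * 1000


def naive_decimal_years(start: datetime, end: datetime) -> float:
    """Elapsed milliseconds divided by the length of an average year."""
    elapsed_ms = (end - start).total_seconds() * 1000
    return elapsed_ms / MS_PER_AVERAGE_YEAR


# One full calendar year of 365 days, and one of 366 days (2020 is a leap year).
common_year = naive_decimal_years(
    datetime(2019, 1, 1, tzinfo=timezone.utc),
    datetime(2020, 1, 1, tzinfo=timezone.utc),
)
leap_year = naive_decimal_years(
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 1, tzinfo=timezone.utc),
)
```

Here `common_year` comes out a little under 1 and `leap_year` a little over 1, even though each span is exactly one calendar year. That is the kind of error the average introduces.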

So we know the root issue is that years have a non-constant length. What does a decimal year represent, then? Just like everything else about our measurement of time, it’s an arbitrary representation. I think it makes the most sense to make the...

Continue reading →


Databases

Our databases are inflexible. Our databases are designed for specific tasks, even if they’re intended to be general purpose. They make decisions for us, choosing how consistency, availability, and partition tolerance fit together. By design, we first select our databases around the data we want to store, rather than basing our decision on their maturity and features.

Normalized databases lend themselves to many different kinds of data, but rigidly enforce a strong consistency policy. Denormalized databases generally relinquish consistency in favor of availability. Some more specialized data stores like Redis prefer consistency and pure data structures, but ultimately don’t scale well, and sacrifice durability in a wild dash for raw speed.

We need a database that gets out of our way, and provides the flexibility we need to get the job done, along the way minimizing resources spent on...

Continue reading →


Security and Passwords

Security is hard, as reflected by the increasing number of security breaches at high-profile services. Most services use a username and password authentication scheme. The username represents identity, and the password verifies that identity. Developers and consumers alike find passwords difficult to manage. Developers struggle to build systems that securely handle and store passwords, while consumers fail to create passwords that are both sufficiently memorable and sufficiently random.

There has to be a better way.

However, if we find a better way to identify and authenticate users, it will take time for most services to catch on. In the meantime, how do we create better passwords? The best passwords should roll off the fingers much like an elegant word rolls off the tongue. We humans are good at optimizing movement, provided the movement can be optimized, so why not use that knowledge to create better passwords?

...

Continue reading →