Django Everywhere

Apr 30, 2023

Our goal with Workflows & Automations includes standardizing on Python as our backend language of choice. We wanted to standardize on a common web framework as well. While Flask and FastAPI are the most popular choices for APIs, they take a no-batteries-included approach, leading to N+1 ways of building software. opsZero is building an opinionated stack, and that requires standardization, so we have chosen Django as our web framework.

Our entire business already relies on Django as its primary framework, so we have extensive experience with it, and we absolutely love the built-in templating and the ORM with migrations. Lastly, with ASGI, a lot of extra functionality for WebSockets and events is built into the framework via Channels. This built-in functionality, along with years of Django experience, means we can provide Django expertise quickly.
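
As a rough sketch of what Channels provides out of the box, a WebSocket consumer can be as small as the following; the EchoConsumer name and echo behavior are illustrative placeholders rather than part of our templates:

# consumers.py -- minimal Channels WebSocket consumer (illustrative sketch)
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class EchoConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        # Accept every incoming WebSocket connection.
        await self.accept()

    async def receive_json(self, content, **kwargs):
        # Echo the JSON payload back to the client.
        await self.send_json({"echo": content})

Wired into asgi.py through a ProtocolTypeRouter and URLRouter, a consumer like this runs alongside the normal Django HTTP views.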

We want to meet our customers' needs regardless of the cloud they are using, and at the lowest cost possible, so we are making pre-built templates available for running Django in both serverless and Kubernetes environments. We are releasing three templates:

These three templates allow us to deliver value to you faster.

Workflows & AI

Elon Musk's Engineering Principles

Apr 16, 2023

Think what you want of Elon Musk, but he has achieved quite a bit by engineering novel solutions to complex problems. We have worked on implementing the same process, to great effect, in what we do. The principles are:

  1. Fix dumb requirements. Each requirement has a specific owner.
  2. Remove unnecessary parts
  3. Simplify/Optimize
  4. Speed up cycle time
  5. Automate

You can watch him describe his process here:

This is how we think about applying these principles at opsZero.

Fix dumb requirements

When solving a problem for a customer, the customer may not actually know what they need, so uncover the actual requirement behind the request. For example, a problem reported as "the production database has high CPU and clients can't connect" may have the root cause that the production database is being used to replicate data to a data warehouse. That root cause can be uncovered through a five whys analysis.

Second, each requirement needs a clear owner responsible for it. If something has no owner, that itself is an issue. Ownership of each component means someone exists to optimize each piece.

Remove unnecessary parts

Systems become complex over time. Pieces are added that don't need to exist, or they were added and then forgotten about. Over time, systems should get less complex, not more. As we build, we build to get the task done, which means we may add complexity that didn't need to exist; because we are pathfinding our way to the solution, that complexity is tolerated at first. Once the system works as needed, we go back and remove the pieces that are not needed.

Simplify/Optimize

After removing unnecessary parts, there may still be complexity within the current components. To simplify these components, we need to reduce variability and increase standardization. The use of if-else blocks to account for variability can increase complexity. However, simplification by reducing variability requires a subjective decision on the optimal approach. Therefore, it's best to initially build with some variability and refine through A/B testing over time towards the optimal solution.

Speed up cycle time

Once an optimal approach is found, remove the remaining variability so there is a single standardized way to deliver. Fewer if-else blocks means faster outcomes and better flow.

Automate

Lastly, automate the processes such that things happen without intervention.

Follow Up, Follow Up, Follow Up

Mar 23, 2023

From the moment a potential customer contacts us, we treat them as a valued customer, regardless of whether they have given us any money. This approach aligns with our primary work principle: Customer First. Our success depends on being the best alternative available to the customer. Given the numerous options at their disposal, it is crucial that we provide exceptional service from the moment they engage with us.

When a prospective customer reaches out, we commit to doing everything possible to help them succeed and to act in their best interest. We earn their business by diligently answering questions, consistently following up to stay top of mind, offering guidance throughout their decision-making process (even if it leads them to choose another provider), and working to build trust. Our business model relies on trust, which must be earned rather than simply granted. To entrust us with something as sensitive as their cloud infrastructure, customers need to know that we are their partners from the very beginning.

We may not always secure a customer, and frankly, we do not aim to convert every prospect. However, by conscientiously sharing our expertise, we gain valuable information that improves our own capabilities.

Keep the following guidelines in mind:

  1. Follow up continuously. Touch base after meetings and a few days later with open-ended questions. Maintain regular communication so the prospect has us top of mind. Simple messages, such as "Just following up to see if you have any additional questions?" can be effective. We should never let more than half a week pass without communication.
  2. If a prospect needs more time, offer assistance with their decision-making process. Much of our work is specialized and carries long-term consequences, so we must find ways to guide potential customers in their decisions. Whether it involves connecting them with our existing clients or conducting research on competitors and comparing pros and cons, we help. This process ultimately benefits us by revealing areas for improvement.

Our high-touch customer service begins the moment a prospect contacts us.

Principles

Capability Based Growth

Mar 16, 2023

We use our capabilities in one market and enter a new market with the same expertise. We tailor the marketing and sales to the particulars of the new sector, but our beachhead is to sell what we have already built for an existing vertical. The process is as follows:

  1. If we enter a new industry, we may not clearly understand how things work. Trying to develop a new business model causes context-switching problems and a loss of momentum and focus. Selling products and services we can already deliver but to a new vertical gets us in the door. 

  2. Once we are in the door, we use that as a learning opportunity to understand the vertical’s needs and build our mental models for how the industry works. The best way to learn about an industry is directly from the customers. What are their needs? What are their pain points? What are the tools they are already using?

  3. Once we understand the customer, we can build products specific to their needs. This includes developing partner relationships, distribution channels, and tuck-in acquisitions. By building after understanding, we reduce the waste in building new solutions.

  4. The last step is to repeat this process in a new vertical.

This approach ensures that the focus is on delivering value to the customer from the start, as opposed to meandering around.

Principles

Using Cloudflare D1

Feb 26, 2023

Cloudflare D1 is a great way to quickly create and work with SQLite databases where a larger PostgreSQL or MySQL database doesn't make sense. Here are some examples for getting started with D1.

Create the Database and Table

wrangler d1 create data-cloud-vendors
wrangler d1 execute data-cloud-vendors --command='CREATE TABLE Customers (CustomerID INT, CompanyName TEXT, ContactName TEXT, PRIMARY KEY (`CustomerID`));'

Query Data

wrangler d1 execute data-cloud-vendors --command='SELECT * FROM Customers' --json

Download, Edit Locally, Upload

wrangler d1 backup create data-cloud-vendors
wrangler d1 backup download data-cloud-vendors <backup-id>
sqlite3 file.sqlite3 .dump > schema.sql
# Edit schema.sql locally and add DROP TABLE IF EXISTS so the import replaces existing tables (sketched below)
wrangler d1 execute data-cloud-vendors --file=schema.sql
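
After downloading and dumping the backup, the edited schema.sql might look roughly like the following before re-uploading; the INSERT row is an illustrative placeholder:

-- schema.sql (sketch): drop and recreate the table so the import replaces existing data
DROP TABLE IF EXISTS Customers;
CREATE TABLE Customers (CustomerID INT, CompanyName TEXT, ContactName TEXT, PRIMARY KEY (`CustomerID`));
INSERT INTO Customers (CustomerID, CompanyName, ContactName) VALUES (1, 'Acme', 'Jane Doe');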

PostgreSQL Troubleshooting

Feb 22, 2023

Logical Replication

Logical replication is commonly used to replicate data out of PostgreSQL, for example to a data warehouse. Check whether it is enabled:

SELECT name,setting FROM pg_settings WHERE name IN ('wal_level','rds.logical_replication');

Inactive Replication

If a replication slot is inactive, PostgreSQL retains WAL for it and wastes disk space, so make sure unused slots are dropped.

SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),restart_lsn)) AS replicationSlotLag, active FROM pg_replication_slots ;

-- SELECT pg_drop_replication_slot('your_slot_name');

Disk Usage

Check what is taking up storage space: temporary files per database and the overall size of each database.

SELECT datname, temp_files AS "Temporary files",pg_size_pretty(temp_bytes) AS "Size of temporary files" FROM pg_stat_database ;

SELECT
    pg_database.datname,
    pg_size_pretty(pg_database_size(pg_database.datname)) AS size
    FROM pg_database;

Understand the Sizes of Indexes

SELECT indexrelname, pg_size_pretty(pg_relation_size(indexrelid)) AS index_size FROM pg_stat_user_indexes ORDER BY pg_relation_size(indexrelid) DESC;

DataOps

Cloud Storage AWS S3 vs Azure Storage vs Cloudflare R2

Feb 20, 2023

What storage provider should I use?

Increasingly, Cloudflare R2 looks like the best option for most users who don't have strict compliance requirements. With its S3-compatible API, it is easy to migrate existing data and integrate it into your existing applications.
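
As a quick sketch of that migration path, an existing S3 client such as boto3 only needs to be pointed at the R2 endpoint; the account ID, bucket name, and credentials below are placeholders:

import boto3

# Point a standard S3 client at Cloudflare R2's S3-compatible endpoint.
# <ACCOUNT_ID> and the credentials come from the Cloudflare dashboard (R2 API tokens).
r2 = boto3.client(
    "s3",
    endpoint_url="https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
    aws_access_key_id="<R2_ACCESS_KEY_ID>",
    aws_secret_access_key="<R2_SECRET_ACCESS_KEY>",
)

# Existing S3 code keeps working, e.g. uploading and listing objects.
r2.upload_file("backup.sql", "example-bucket", "backups/backup.sql")
print(r2.list_objects_v2(Bucket="example-bucket").get("Contents", []))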

What are other considerations when choosing between storage providers?

You may want to consider the usage pattern of your own application. For example, if your app runs inside AWS and is constantly reading and writing objects, you may get better performance with S3, since requests to Cloudflare R2 have to reach outside the AWS network.

What are the pitfalls of using Cloudflare R2 vs AWS S3?

One of the major considerations in choosing between S3 and R2 is that R2 is a newish platform, while S3 has been around for 17 years (as of 2023). There is a lot of accumulated knowledge, and AWS provides a lot of different ways to optimize S3 for your workloads. R2 may require some rearchitecting of your platform to use well.

DataOps

Using PostgreSQL

Feb 14, 2023

Connect to a database through a bastion

SSH_PRIVATE_KEY=~/.ssh/id_rsa

RDS_DATABASE_HOST=opszero-database.aasdasd.us-east-1.rds.amazonaws.com
RDS_DATABASE_PORT=5432
RDS_DATABASE_USERNAME=postgres
RDS_DATABASE_PASSWORD=postgres
RDS_DATABASE_DB=postgres_development

BASTION_USERNAME=ubuntu
BASTION_HOST=137.32.32.83

ssh -i ${SSH_PRIVATE_KEY} -f -N -L ${RDS_DATABASE_PORT}:${RDS_DATABASE_HOST}:${RDS_DATABASE_PORT} ${BASTION_USERNAME}@${BASTION_HOST} -v

# In another terminal

psql "postgresql://${RDS_DATABASE_USERNAME}:${RDS_DATABASE_PASSWORD}@127.0.0.1:5432/${RDS_DATABASE_DB}"


Dump and Restore

pg_dump 'postgresql://postgres:[email protected]:5432/db' > backup.sql
psql "postgresql://${RDS_DATABASE_USERNAME}:${RDS_DATABASE_PASSWORD}@34.29.235.84:5432/restored_db" -f backup.sql

Create a User

CREATE USER newuser123 WITH PASSWORD 'foobar123';

-- Read-only access
GRANT CONNECT ON DATABASE database_name TO newuser123;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO newuser123;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO newuser123;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO newuser123;

-- Grant all on a database
GRANT ALL PRIVILEGES ON DATABASE database_name TO newuser123;

-- Turn the user into a superuser
ALTER USER newuser123 WITH SUPERUSER;

Useful Stats

-- show running queries (9.2+)
SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;

-- kill a running query (use the pid from pg_stat_activity)
SELECT pg_cancel_backend(pid);

-- kill an idle query
SELECT pg_terminate_backend(pid);

-- vacuum command
VACUUM (VERBOSE, ANALYZE);

-- show running queries (pre 9.2)
select * from pg_stat_activity where current_query not like '<%';

-- all database users
select * from pg_user;

-- all databases and their sizes
select datname, pg_size_pretty(pg_database_size(datname))
from pg_database
order by pg_database_size(datname) desc;
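
-- all tables and their sizes, with and without indexes
-- (one standard approach using pg_stat_user_tables; total size includes indexes and TOAST)
select relname,
       pg_size_pretty(pg_total_relation_size(relid)) as total_size,
       pg_size_pretty(pg_relation_size(relid)) as table_size_without_indexes
from pg_stat_user_tables
order by pg_total_relation_size(relid) desc;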

-- cache hit rates (should not be less than 0.99)
SELECT sum(heap_blks_read) as heap_read, sum(heap_blks_hit)  as heap_hit, (sum(heap_blks_hit) - sum(heap_blks_read)) / sum(heap_blks_hit) as ratio
FROM pg_statio_user_tables;

-- table index usage rates (percentage of reads using an index; should not be less than 99)
SELECT relname, 100 * idx_scan / (seq_scan + idx_scan) percent_of_times_index_used, n_live_tup rows_in_table
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;

-- how many indexes are in cache
SELECT sum(idx_blks_read) as idx_read, sum(idx_blks_hit)  as idx_hit, (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit) as ratio
FROM pg_statio_user_indexes;

DevOps

Deploying to Cloudflare Pages using Github Actions

Feb 14, 2023

Cloudflare provides a great CDN with no egress charges on bandwidth, and one of the easiest ways to use it is through Cloudflare Pages. Using Cloudflare Pages should be pretty straightforward for most frameworks that generate a SPA. However, see the example below for how to publish the output of asset pipelines such as Ruby on Rails' and Django's to Cloudflare Pages.

Here is an example of using GitHub Actions to publish Django static files:

  - name: Build Static Files
    run: |
      docker run --env STATIC_ROOT='/static-compiled/' \
        --env DATABASE_URL='sqlite:///db.sqlite' \
        -v $PWD/static:/app/static -v $PWD/static-compiled:/static-compiled \
        $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG \
        python manage.py collectstatic --noinput

  - name: Publish Static Files
    uses: cloudflare/[email protected]
    with:
      apiToken: ${{ secrets.CF_API_TOKEN }}
      accountId: ${{ secrets.CF_ACCOUNT_ID }}
      command: pages publish ./static-compiled --project-name=opszero-static --commit-dirty=true

DevOps

Setting Github Secrets

Feb 14, 2023

Run the following within your repo:

gh secret set nameofsecret --body "Secret"

This makes the secret available to GitHub Actions runners. Alternatively, you can go to the web interface at https://github.com/<owner>/<repo>/settings/secrets/actions to update the secret.
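
For reference, a workflow step can then read the secret from the secrets context; the step name and environment variable below are illustrative:

  - name: Use the secret
    env:
      NAME_OF_SECRET: ${{ secrets.nameofsecret }}
    run: test -n "$NAME_OF_SECRET"  # fails if the secret is empty or missing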

DevOps