Using Cloudflare D1

Feb. 26, 2023, 7:02 a.m.

Cloudflare D1 is a great way to quickly create and work with SQLite databases when a larger PostgreSQL or MySQL deployment doesn't make sense. Here are some example commands for working with D1 quickly.

Create the Database and Table

wrangler d1 create data-cloud-vendors
wrangler d1 execute data-cloud-vendors --command='CREATE TABLE Customers (CustomerID INT, CompanyName TEXT, ContactName TEXT, PRIMARY KEY (`CustomerID`));'

Query Data

wrangler d1 execute data-cloud-vendors --command='SELECT * FROM Customers' --json
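Writes follow the same pattern as the query above. As a minimal sketch, the statements below can be tried against a throwaway in-memory SQLite database with Python's standard `sqlite3` module before they are passed to `wrangler d1 execute data-cloud-vendors --command=...`. The sample company and contact names, and the `run_locally` helper, are invented for illustration. It also assumes D1 treats these simple statements the same way local SQLite does.

```python
import sqlite3

# Same schema as the CREATE TABLE statement passed to wrangler earlier.
SCHEMA = (
    "CREATE TABLE Customers ("
    "CustomerID INT, CompanyName TEXT, ContactName TEXT, "
    "PRIMARY KEY (`CustomerID`));"
)

# Hypothetical sample statements; the names are made up for illustration.
INSERT = (
    "INSERT INTO Customers (CustomerID, CompanyName, ContactName) "
    "VALUES (1, 'Example Co', 'Alice');"
)
UPDATE = "UPDATE Customers SET ContactName = 'Bob' WHERE CustomerID = 1;"
SELECT = "SELECT * FROM Customers;"


def run_locally(statements):
    """Run each statement against a throwaway in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        rows = []
        for sql in statements:
            rows = conn.execute(sql).fetchall()
        conn.commit()
        return rows  # rows produced by the last statement
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_locally([SCHEMA, INSERT, UPDATE, SELECT]))
```

Because the INSERT and UPDATE statements contain single quotes, wrap them in double quotes on the shell command line rather than the single quotes used for the SELECT.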

Download, Edit Locally, Upload

wrangler d1 backup create data-cloud-vendors
wrangler d1 backup download data-cloud-vendors <backup-id>
sqlite3 file.sqlite3 .dump > schema.sql
# Edit schema.sql to add a DROP TABLE IF EXISTS statement before each CREATE TABLE
wrangler d1 execute data-cloud-vendors --file=schema.sql
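The manual "add a drop table" edit can also be scripted. The sketch below is one possible approach rather than part of the original workflow: a hypothetical helper, `add_drop_statements`, that puts a `DROP TABLE IF EXISTS` line before every line starting with `CREATE TABLE` in a dump. It assumes each `CREATE TABLE` starts at the beginning of a line, as it does in `sqlite3 .dump` output, and it does not guard against row data that happens to contain such a line.

```python
import re
import sqlite3

# Captures the table name from lines such as
#   CREATE TABLE Customers (...
#   CREATE TABLE IF NOT EXISTS "Customers" (...
CREATE_TABLE = re.compile(
    r'^CREATE TABLE\s+(?:IF NOT EXISTS\s+)?["`\[]?(\w+)["`\]]?',
    re.IGNORECASE,
)


def add_drop_statements(dump_sql: str) -> str:
    """Insert DROP TABLE IF EXISTS before every CREATE TABLE line."""
    out = []
    for line in dump_sql.splitlines():
        match = CREATE_TABLE.match(line)
        if match:
            out.append(f'DROP TABLE IF EXISTS "{match.group(1)}";')
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Build a small database in memory and dump it, standing in for schema.sql.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE Customers (CustomerID INT, CompanyName TEXT)")
    source.execute("INSERT INTO Customers VALUES (1, 'Example Co')")
    source.commit()
    dump = "\n".join(source.iterdump())
    source.close()
    print(add_drop_statements(dump))
```

To apply it to the real file, read `schema.sql`, pass its text through `add_drop_statements`, and write the result back before running the final `wrangler d1 execute` command.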

And there is Azure coming from behind...

Nov. 17, 2022, 5:49 a.m.

The Bay Area startup tech stack is MacBooks, Google Workspace, Slack, iPhones, and either AWS or Google Cloud. The rest of the world seems to be Microsoft Windows, Microsoft Office, Microsoft Teams, Android, and an on-site SharePoint server. AWS has the most to lose as Azure catches up.

My wife recently needed Parallels with Windows installed on her MacBook to use ArcGIS. I thought, what the hell, and decided to install Parallels on my own machine because I’ve heard so much about how much better Excel for Windows is than the Mac version. (Yes, I got excited about Excel, so sue me...) So I did it. And having played with Windows for the first time in a decade and a half, I have to say I finally get Microsoft’s strategy after seeing this parallel universe.

Microsoft is playing a long game, and that game is to tie everything, and I mean everything, to Microsoft Azure. GitHub, Office, Excel, VSCode, Windows, the Power Platform: everything at Microsoft seems to be consistently working toward a connection to Azure. Excel pulls data from Azure, making it an alternative to tools like Tableau. GitHub Actions uses Azure for compute. VSCode seems to be connecting more and more to Azure for easy deployments. Windows seems to have easy corporate deployment options via Active Directory on Azure.

If you are in the Bay Area bubble with the Apple, Google, and AWS tech stack, you may be missing out on one of the significant technological shifts. I am betting the winner, in the long run, will be Microsoft. Microsoft has a huge distribution advantage. Say what you will about Steve Ballmer, but he built a high-power enterprise sales team at Microsoft. Buying a single unified package from Microsoft will, over time, be cheaper than buying piecemeal software from different vendors. This is why Slack lost. But everyone in the Bay Area was scratching their heads over why Slack lost, because we were looking at Google as the 800-pound gorilla, not Microsoft, which is now the 1200-pound gorilla.

So what is the long-term trajectory? I think from a technological standpoint, Azure will consistently be behind AWS. Microsoft is a close follower, not a leader. So if you want the newest, then AWS will still likely be the primary Cloud provider to use. However, if your company is conservative and doesn’t care about newness, then Microsoft will be just fine. There will be deals put in place that give companies Azure, Office, and Teams together at a rate below what others are offering, and companies will pay for it.

This is all speculative, of course, and Amazon, being one of the most innovative companies of our generation, will hopefully give Microsoft a run for its money. But at this point, the two Clouds I am betting on for production, compliance-oriented workloads are Azure, then AWS.

Windows Based Crawler

Nov. 15, 2022, 4:31 a.m.

I like Excel for Windows. The Mac version is a joke compared to what the full-blown Windows version can do with data analysis and data finagling right from the app itself. A lot of what I have been working on lately has been getting data crawled with Playwright into Excel workbooks stored on OneDrive. The reason for this is that some of the data is small enough that building a full database isn't necessary, and it is not normalized enough to just use Power Query.

To achieve this, I use GitHub Actions to trigger the run. The workflow triggers on a schedule and sends the job to a self-hosted GitHub runner, which starts a Python script. Since the runner has access to the root volume on the Mac Mini (don't worry, the machine is dedicated to just GitHub Actions), I can use xlwings to launch Excel and update the workbook. Once completed, it just copies the file into OneDrive or Dropbox for me to access elsewhere.
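The Excel step of that pipeline might look roughly like the sketch below. The actual script is not shown here, so everything in it is an assumption for illustration: the function names `to_table` and `update_workbook`, the choice to write into the first sheet, and the idea that crawled records arrive as a list of dicts. The xlwings calls need Excel installed on the runner, which is why the import sits inside the function that uses it.

```python
from pathlib import Path
import shutil


def to_table(records, columns):
    """Turn a list of dicts into a header row plus one row per record."""
    rows = [list(columns)]
    for record in records:
        # Missing fields become empty cells instead of raising KeyError.
        rows.append([record.get(col, "") for col in columns])
    return rows


def update_workbook(workbook_path, rows, sync_dir):
    """Write rows into the first sheet via Excel, then copy the file to sync_dir."""
    import xlwings as xw  # imported here so to_table() works without Excel

    app = xw.App(visible=False)
    try:
        book = app.books.open(str(workbook_path))
        sheet = book.sheets[0]
        sheet.clear_contents()
        sheet.range("A1").value = rows  # a 2D list fills a block of cells
        book.save()
        book.close()
    finally:
        app.quit()  # always shut Excel down, even if the update fails
    # sync_dir stands in for a local OneDrive or Dropbox folder.
    shutil.copy2(workbook_path, Path(sync_dir) / Path(workbook_path).name)


if __name__ == "__main__":
    # Hypothetical crawled records; real ones would come from Playwright.
    sample = [{"name": "a", "price": 1}, {"name": "b"}]
    print(to_table(sample, ["name", "price"]))
```

Keeping the reshaping in `to_table` separate from the Excel automation means the data handling can be checked on any machine, while only `update_workbook` depends on the self-hosted runner.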

The workflow is no different from one written for a hosted runner, except that it runs on a self-hosted instance that happens to have Excel installed:

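name: Download and Upload
on:
  schedule:
    - cron: "0 1 * * *"
  push:  # assumed trigger; the original key was lost
    branches:
      - main

jobs:
  build:  # placeholder job id; the original was lost
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v3
      - name: Install Dependencies
        run: |
          pyenv global 3.11
          pip3 install -r ./requirements.txt
      - name: Combine
        run: |
          python ./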