Creating a Personal Crypto Alert System: A Step-by-Step Guide

Chapter 1: Introduction to Personal Crypto Alerts

If you’re anything like me, your journey into cryptocurrency started recently, just as the market seemed to plateau and stop making headlines about overnight millionaires. Now, you find yourself anxious, fearing that you might wake up to discover your investment has plummeted in value. Unlike the stock market, which has set trading hours, cryptocurrency prices can fluctuate at any moment.

Wouldn't it be great to receive alerts about significant price movements without the marketing noise from exchange notifications? It seems the only solution is to create a system ourselves. Let’s dive into the project.

The Objective

The aim of this endeavor is to set up an email notification system that alerts us whenever the price of specific cryptocurrencies changes by 5% or more within a day. In the future, I may expand this to include various timeframes, multiple currencies, and different percentage thresholds. However, for our minimum viable product (MVP) and learning purposes, this straightforward goal is an excellent starting point.

Rekindling the Passion

It’s not my responsibility to rescue you; that’s up to you. As a data professional, you shouldn’t find this task overly challenging. What’s that? You’ve only worked with datasets from your job and never attempted to collect, analyze, or alert on a dataset you actually care about? That’s unfortunate. When did you stop being a dreamer?

Modern data stacks can be both costly and tedious. Allow me to guide you through the process of establishing your own Personal Data Stack, enabling you to gather and analyze whatever you desire.

To achieve this, we will:

  1. Retrieve the latest cryptocurrency price data from an API every five minutes.
  2. Run a query every hour to calculate the percentage change for a specific coin.
  3. Send an email alert if a significant price change occurs.

There are numerous approaches to implement this, and various design choices will impact the system's cost, maintainability, and performance. I’ll clarify my rationale behind each decision and discuss the advantages and disadvantages of alternatives.

Let’s get started!

Step 1: Data Collection

The necessary data won’t simply appear at your doorstep; you need to collect it actively. For this project, we will utilize the CryptoCompare API to obtain the latest price quotes for the cryptocurrencies we wish to monitor.

Here’s an example of how to fetch the price of four currencies (BTC, ETH, REP, DASH) using Python, transform it into a DataFrame, and store it directly in S3 using the AWS Wrangler package:

[Image: Crypto price data fetching process]
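Since that code appears above only as a screenshot, here’s a rough sketch of what the step could look like. The CryptoCompare pricemulti endpoint and the AWS Wrangler call are real, but the bucket path, column names, and helper name are placeholders rather than the exact code:

    import datetime

    import awswrangler as wr
    import pandas as pd
    import requests

    COINS = ["BTC", "ETH", "REP", "DASH"]
    S3_PATH = "s3://my-crypto-bucket/crypto_prices/"  # placeholder bucket and prefix

    def fetch_and_store_prices() -> pd.DataFrame:
        # CryptoCompare's pricemulti endpoint returns {"BTC": {"USD": ...}, ...}
        resp = requests.get(
            "https://min-api.cryptocompare.com/data/pricemulti",
            params={"fsyms": ",".join(COINS), "tsyms": "USD"},
            timeout=10,
        )
        resp.raise_for_status()
        quotes = resp.json()

        # One row per coin, stamped with the fetch time
        fetched_at = datetime.datetime.utcnow()
        df = pd.DataFrame(
            [{"coin": c, "price_usd": quotes[c]["USD"], "fetched_at": fetched_at} for c in COINS]
        )

        # Append a small Parquet file to S3; AWS credentials come from the environment
        wr.s3.to_parquet(df=df, path=S3_PATH, dataset=True, mode="append")
        return df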

Scheduling the Script

Having a script to fetch data is fantastic, but scheduling it to run every five minutes is even better. To simplify my life, I decided to experiment with a new notebook tool called Hex. My main requirements were:

  • An intuitive IDE environment
  • A straightforward deployment and scheduling process
  • An easy way to access historical logs

Hex mostly meets these criteria, though there are some caveats. Notably, the ability to schedule jobs (referred to as apps) is a premium feature that costs $75 per month, which is a bit high for a personal budget. Additionally, the maximum scheduling frequency is once per hour. However, with a little creativity, I managed to run the data-fetching code every five minutes by wrapping it in a loop that executes ten times with a 300-second pause.

Here’s a conceptual overview of the code:

[Image: Pseudo-code for scheduling data fetch]
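In rough Python terms, the workaround boils down to something like this, reusing the fetch_and_store_prices helper sketched above:

    import time

    # One hourly run of the Hex app covers ten five-minute fetches
    for _ in range(10):
        fetch_and_store_prices()
        time.sleep(300)  # wait five minutes before the next fetch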

Despite these limitations, the experience of developing and deploying Python code with Hex was quite enjoyable. Its remarkable features for SQL templating and creating DataFrames from SQL results will be covered in the Alerter notebook section.

To see the deployed script in the Hex app, click here!

Dealing with Small File Issues Using Delta Lake

Once the data retrieval script is operational, we’ll start to accumulate objects in S3.

[Image: Accumulating data objects in S3]

While this is exciting, generating numerous small files can hinder query performance. This is a common challenge in object stores; we must manage these objects to ensure optimal performance. I have worked at companies that addressed this issue internally by implementing a Compaction Service, which ran Spark jobs to consolidate small files into larger ones. However, I do not recommend this approach.

Instead, I will utilize technologies designed to tackle this challenge, specifically Databricks' COPY INTO command to import data files into a Delta Lake table, followed by the OPTIMIZE command to compact them.

Here’s a screenshot of the commands I run as a Databricks Job every hour to maintain an optimized Delta table:

[Image: Optimizing Delta Lake table with Databricks commands]
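In plain text, those two commands look roughly like this; the S3 path is a placeholder, and the crypto_prices_delta table is assumed to already exist as a Delta table:

    -- Ingest any new Parquet files from S3; COPY INTO skips files it has already loaded
    COPY INTO crypto_prices_delta
      FROM 's3://my-crypto-bucket/crypto_prices/'
      FILEFORMAT = PARQUET;

    -- Compact the many small five-minute files into fewer, larger ones
    OPTIMIZE crypto_prices_delta;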

Now we have a single, larger file in the Delta Table, significantly improving performance for analysis as the original dataset grows to thousands, and eventually millions, of files.

[Image: Delta Lake performance comparison]

Analyzing and Alerting on the Delta Table

After refreshing my SQL skills, I successfully wrote a query to calculate the daily percentage change from the crypto_prices_delta table:

[Image: SQL query for daily percentage change calculation]
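Since the query also appears only as a screenshot, here’s a rough reconstruction of what it could look like, assuming the coin / price_usd / fetched_at column layout from the earlier sketch:

    -- Daily percentage change per coin: first vs. latest price seen today
    SELECT
      coin,
      ROUND(
        100 * (MAX_BY(price_usd, fetched_at) - MIN_BY(price_usd, fetched_at))
            / MIN_BY(price_usd, fetched_at),
        2
      ) AS pct_change
    FROM crypto_prices_delta
    WHERE DATE(fetched_at) = CURRENT_DATE()
    GROUP BY coin;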

The final step is to run this query regularly to trigger alerts. Occasionally, the universe aligns with your objectives, and unexpected opportunities arise. For this project, that was the recent partnership announcement between Hex and Databricks, allowing you to configure a Databricks SQL endpoint as a data source in Hex.

After setting up the SQL Endpoint in Databricks and creating a data source in Hex, I was able to write queries against it. Two features in Hex made the experience enjoyable: SQL parameterization and DataFrame SQL. SQL parameterization allowed me to define today’s date in Python and incorporate it into my query using Jinja syntax. DataFrame SQL lets you execute a query and automatically transform the results into a Pandas DataFrame. After calculating the daily percentage change, I filtered the results to show only those exceeding a 5% change.

Here’s a preview of the Hex notebook:

[Image: Hex notebook for filtering results]
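Conceptually, the two SQL cells look something like the following. The run_date variable is defined in a Python cell and injected via Jinja, and daily_change and big_movers are the DataFrames Hex creates from each cell’s result; all of these names are illustrative rather than taken from the actual notebook:

    -- SQL cell against the Databricks source; {{ run_date }} comes from a Python
    -- variable via Hex's Jinja templating. Hex exposes the result as `daily_change`.
    SELECT coin,
           ROUND(100 * (MAX_BY(price_usd, fetched_at) - MIN_BY(price_usd, fetched_at))
                     / MIN_BY(price_usd, fetched_at), 2) AS pct_change
    FROM crypto_prices_delta
    WHERE DATE(fetched_at) = '{{ run_date }}'
    GROUP BY coin;

    -- DataFrame SQL cell: query the `daily_change` DataFrame directly and keep
    -- only moves of 5% or more; Hex exposes the result as `big_movers`.
    SELECT * FROM daily_change WHERE ABS(pct_change) >= 5;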

Awesome! The final step involved raising an exception in the big_movers DataFrame if it contained any rows. I then scheduled the Hex notebook as an App to send an email alert to my inbox if it failed. Click here to view the complete notebook in Hex!
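That final cell can be as small as this; again, the DataFrame name follows the sketch above rather than the actual notebook:

    # Final Python cell: fail the scheduled run when any coin moved 5% or more,
    # so Hex's failure notification becomes the email alert.
    if len(big_movers) > 0:
        raise Exception(
            f"Big daily movers detected:\n{big_movers.to_string(index=False)}"
        )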

Looking Ahead

There are numerous ways to enhance this pipeline, starting with the additional flexibilities mentioned earlier. Nevertheless, I’m pleased that I was able to create a functional solution without managing any services myself and at a reasonable cost.

In terms of expenses, this setup costs around $12 daily for AWS, with an additional $6 for the Compaction Job and SQL Endpoint on Databricks.

[Image: Cost breakdown for Cryptolake Pipeline]

When viewed in this light, $2.50 per day for Hex seems reasonable, although it’s tempting to consider alternatives that are free or self-hosted.

Thanks for reading! If you found this article valuable, consider following me on Twitter for future updates. I welcome your thoughts and comments below!


