Where Should I Put Business Logic: Frontend, Backend, or Database?
Let's consider an application where we have 100 internet-connected devices, each with 5 sensors that send a new reading to the server every minute. On a webpage, we want to display a list of all the devices, and an average value for each sensor for the last 24 hours. Let's consider our options:
1. Front End
First, we can do one query to retrieve all the device records from the database, joining the last 24 hours of data to each device. Then, we send all of this data to the front end. In the front end, we loop through each device, grab its corresponding data, and then calculate the average reading for each sensor.
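A minimal sketch of option 1's client-side work, assuming a hypothetical response shape (the `Device` and `Reading` types below are illustrative, not a real API):

```typescript
// Hypothetical shape of the API response: each device arrives with its
// raw readings for the last 24 hours.
interface Reading {
  sensorId: string;
  value: number;
}

interface Device {
  id: string;
  readings: Reading[]; // up to 1440 readings per sensor
}

// Compute the 24-hour average per sensor for one device, in the browser.
function sensorAverages(device: Device): Record<string, number> {
  const totals: Record<string, { sum: number; count: number }> = {};
  for (const r of device.readings) {
    const t = totals[r.sensorId] || { sum: 0, count: 0 };
    t.sum += r.value;
    t.count += 1;
    totals[r.sensorId] = t;
  }
  const averages: Record<string, number> = {};
  for (const sensorId of Object.keys(totals)) {
    averages[sensorId] = totals[sensorId].sum / totals[sensorId].count;
  }
  return averages;
}
```

The logic itself is trivial; the cost of this option is in shipping all the raw readings to the client first.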
2. Back End
Let's query the device records and data just as before, only this time we loop through each device and calculate the averages on the back end. Then we send only the averages to the front end.
3. Database
In this option, we can do a single database query that retrieves all the device records joined with their data, and use the database's average function to calculate the averages. Then we return only those averages to the front end.
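As a sketch of option 3, assuming a hypothetical Postgres-style schema with `devices` and `readings` tables (all table and column names here are assumptions), the whole calculation collapses into one aggregate query:

```typescript
// Hypothetical schema: devices(id, name) and
// readings(device_id, sensor_id, value, recorded_at).
// The database computes every average; only the small result set
// travels over the network.
const avgQuery = `
  SELECT d.id, d.name, r.sensor_id, AVG(r.value) AS avg_value
  FROM devices d
  JOIN readings r ON r.device_id = d.id
  WHERE r.recorded_at >= NOW() - INTERVAL '24 hours'
  GROUP BY d.id, d.name, r.sensor_id;
`;
```

The result is one row per device per sensor (500 rows here) instead of 720,000 raw readings.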
Now let's step back and compare these options. If we want an average over the last 24 hours for 100 devices, with 5 sensors each sending a reading every minute, that is 24 hr * 60 min/hr = 1,440 minutes of data. For 100 devices with 5 sensors each, that is 500 * 1,440 = 720,000 data records. Let's assume each record serializes as five key/value fields, each with a key of about 10 characters (10 bytes) and a 64-bit (8-byte) value. That makes each record 10 * 5 (keys) + 8 * 5 (values) = 90 bytes, and 720,000 * 90 bytes ≈ 64.8 MB in total. That is far too much to send to a browser just to display a handful of averages.
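The arithmetic above can be checked in a few lines (all sizes are rough assumptions, as stated):

```typescript
// Back-of-envelope payload size for the frontend option.
const devices = 100;
const sensorsPerDevice = 5;
const minutesPerDay = 24 * 60; // 1440

// One reading per sensor per minute for 24 hours.
const records = devices * sensorsPerDevice * minutesPerDay; // 720,000

// Assumed serialization: five ~10-byte keys plus five 8-byte values.
const bytesPerRecord = 5 * (10 + 8); // 90 bytes

const totalBytes = records * bytesPerRecord;
const totalMB = totalBytes / 1_000_000; // ≈ 64.8 MB
```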
Here are some considerations and general rules to help you decide where to put your business logic:
- User Experience: Reducing the number of API calls will reduce latency. You can upgrade the server, but you can't upgrade the client.
- Security: Never trust the client. Client code can be changed, new code can be written, bad data can be sent.
- Cost: More processing and memory requirements increase server costs.
- Maintainability: Put the code where it is less prone to bugs. Don't repeat yourself.
Rules of Thumb
- Minimize the amount of data transferred from the database to the application
- Scale up on the database, scale out in the application
- Minimize disk I/O by only looking up data required
Database
Generally the fastest option, as it is the source of the data. Best for data integrity (enforcing uniqueness, non-null columns, foreign keys, data types, relatively simple calculations, etc.). Can be less developer-friendly, but also less prone to bugs if done correctly. Slow queries will affect the speed of other users' database queries, and ultimately server performance. Can be upgraded to increase performance.
Back End
Affects the performance of all users. Better for more intensive processes, or processes that require interaction with other services. More secure than the front end. Can be upgraded to increase performance.
Front End
Good for offloading processing from the server, provided it doesn't reduce usability. Lower latency if you can avoid an API request. Best for processing related to the display of data, which can vary by client. Only affects the performance of one specific client. Generally, if a process needs to be secure, do it in the back end; assume the code on the front end can be changed. You cannot count on being able to upgrade the client to increase performance.
Form Validation
Let's say a user wants to book an appointment on your website. They fill out a form and submit their information and the date of the appointment. Where should you validate it?
- Validate it on the front end before submitting. This is faster, as there is no need to make an API call with bad data.
- Also validate it on the back end for security reasons. Do not trust the client to enforce data validation.
- Validate it where possible in the database (e.g. uniqueness and type constraints). Constraints are less prone to bugs than application logic that might fail to enforce them.
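One way to follow all three bullets without repeating yourself, sketched for a hypothetical email-plus-date appointment form, is a single validation function that both the browser and the server can call, with database constraints as the final backstop:

```typescript
// Hypothetical form shape; field names are illustrative.
interface AppointmentForm {
  email: string;
  date: string; // ISO date, e.g. "2030-05-01"
}

// Shared rule set: run it in the browser for fast feedback, and run it
// again on the server before touching the database.
function validateAppointment(form: AppointmentForm): string[] {
  const errors: string[] = [];
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(form.email)) {
    errors.push("invalid email");
  }
  const date = new Date(form.date);
  if (Number.isNaN(date.getTime())) {
    errors.push("invalid date");
  } else if (date.getTime() < Date.now()) {
    errors.push("date must be in the future");
  }
  return errors;
}
```

Because the same function runs in both places, the client-side check can never drift out of sync with the server-side one.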
Large Data Sets
Let's say you have a large dataset, one row for every minute over the last year. You want to chart the data and calculate some metrics such as average, max, min, and median.
- There are over half a million records (365 * 1,440 = 525,600), which could easily be tens of megabytes. That is a lot of data to load into the client's memory.
- Iterating over half a million records to sample the data and calculate these metrics is also CPU-intensive, and could cause not only the webpage but the entire client machine to become sluggish.
- You don't need to chart every single data point, only a sample, so sending all the data over the network is slower and potentially costly (e.g. on mobile data).
- Process the data on the server, if possible, and offload the memory- and CPU-intensive tasks to a larger machine (the server). This will also reduce the number of times the data has to be loaded into memory.
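A minimal sketch of this idea, assuming the raw values are already in memory on the server: compute the metrics once, downsample for the chart, and send only the summary to the client.

```typescript
// Median of an already-sorted array.
function median(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Server-side summary: metrics plus a downsampled series for charting,
// so the client never receives the raw half-million rows.
function summarize(values: number[], chartPoints: number) {
  const sorted = [...values].sort((a, b) => a - b);

  // Take every Nth point so the chart gets roughly chartPoints values.
  const step = Math.max(1, Math.floor(values.length / chartPoints));
  const sample: number[] = [];
  for (let i = 0; i < values.length; i += step) sample.push(values[i]);

  return {
    average: values.reduce((s, v) => s + v, 0) / values.length,
    min: sorted[0],
    max: sorted[sorted.length - 1],
    median: median(sorted),
    sample,
  };
}
```

Every-Nth-point sampling is the simplest choice; a real charting backend might use a bucketed min/max or a dedicated downsampling algorithm instead, but the payload shape (small summary object) is the point here.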