Optimising Technology for Efficiency in Research - InDEx Ecosystem
A long time ago, websites were a bunch of files and, if you were lucky, they were attached to a database. It was simple: you updated the database and the site updated.
Planning how to utilise and optimise technology to help in running InDEx was something I spent a lot of time thinking about. In this blog post, I want to share how I, with InDEx, utilise technology to be more productive, automate processes and be efficient when conducting research. Technology is a great resource when used correctly, and it is not as expensive as you might think. With InDEx, it actually saves money, because we provision servers to meet demand rather than keeping them active all the time, and we do not spend time trying to bring new infrastructure online.
My aim is to use technology to optimise my work to give me more time to undertake research and do the things that matter, whilst saving money.
That is the big picture; I do not cover some of the specifics, such as machine learning, in this post.
User Interface and InDEx API
When developing the front-end and back-end user interface (UI), I went down the responsive web route with the objective of making everything work as nicely as possible across a range of devices and platforms. The UI is designed around CSS media queries and a customised Bootstrap to handle mobile devices with ease. The primary role of the project is to develop a mobile app; however, I have also developed it to work across platforms.
The InDEx API is divided into two parts: a front-end RESTful API and a back-end management RESTful API. I use Express Sessions on the back-end and JSON Web Tokens (JWT) on the front-end to secure the environment. The API has been designed to allow for async requests and links into a MongoDB database.
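To illustrate how token-based security of this kind works in general, here is a minimal sketch of signing and verifying an HS256 JWT using only the Python standard library. It is not the InDEx implementation: the function names, the secret and the use of the `exp` claim are assumptions made for the example, and a real service would normally rely on a maintained JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWT strips."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> bytes:
    """HMAC-SHA256 over the 'header.payload' part of the token."""
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(payload: dict, secret: bytes) -> str:
    """Build an HS256-signed token (hypothetical helper, for illustration only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url_encode(_signature(signing_input, secret))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature is valid and the token is unexpired, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
        expected = _signature(f"{header_b64}.{payload_b64}", secret)
        supplied = _b64url_decode(signature_b64)
    except ValueError:  # wrong number of parts, bad base64 or bad JSON
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    # compare_digest avoids leaking information through timing differences.
    if not isinstance(payload, dict) or not hmac.compare_digest(expected, supplied):
        return None
    expires_at = payload.get("exp")
    current = time.time() if now is None else now
    if expires_at is not None:
        if not isinstance(expires_at, (int, float)) or current >= expires_at:
            return None
    return payload
```

A client would send the string returned by `sign_token` with each request, and the API would call `verify_token` before serving it. Because the signature covers both the header and the payload, any change to the claims invalidates the token.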
I use a private GitHub repository to store all the source code and utilise branches to manage feature development. I utilise an auto deployment script, a version of which I have blogged about here. I do all my work on "Dev" branches and merge into the master branch, at which point the auto deployment script kicks in, pulls the updates and deploys them onto the server. The deployment is seamless; if it fails for some reason, I am notified and the deployment reverts to the last stable release.
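The deploy, notify and revert flow described above can be sketched as follows. This is an illustration of the pattern rather than the actual script: the `deploy` function, its parameters and the idea of passing in `checkout`, `steps` and `notify` callables are all assumptions made so that the example is self-contained.

```python
from typing import Callable, Sequence


def deploy(
    release: str,
    last_stable: str,
    checkout: Callable[[str], None],
    steps: Sequence[Callable[[], None]],
    notify: Callable[[str], None],
) -> str:
    """Deploy `release`; on any failure, notify and revert to `last_stable`.

    Returns the name of the release that is live afterwards.
    """
    try:
        checkout(release)          # e.g. pull and check out the merged code
        for step in steps:         # e.g. install dependencies, restart the service
            step()
    except Exception as error:     # any failing step aborts the deployment
        notify(f"Deployment of {release} failed: {error}; reverting to {last_stable}")
        checkout(last_stable)      # put the last stable release back
        return last_stable
    return release
```

In practice, `checkout` might wrap a call such as `subprocess.run(["git", "checkout", ref], check=True)`, and `notify` might send an email or chat message. Keeping them as parameters means the rollback logic can be tested without touching a real server.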
Cloud Storage and Hosting
To meet legal requirements, I only use London-based instances that are compliant with Privacy Shield, the Data Protection Act and University regulations. This is a very important point that you need to consider from the outset of any project.
A single DigitalOcean (DO) "Droplet" is used to host the website; it does not usually get high volumes of traffic, and therefore we do not need anything complex. This costs about $5 a month. I would have liked to use DO for the entire stack; sadly, they do not currently offer Load Balancing, although they have recently announced that they soon will.
AWS provides Load Balancing, which is exactly what we need for InDEx. The load balancer is capable of scaling automatically based on traffic, or manually when required, and the process takes about 2 minutes in total. The default setup can handle about 100 users concurrently using the InDEx platform. I also use AWS storage to host the MongoDB database, which keeps it separate from the API infrastructure. Machine learning forms part of the InDEx App, and this sometimes requires me to spin up a GPU for processing; I only run this for a couple of hours at a time. In total, AWS costs about $10 a month, depending on usage.
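As a rough illustration of the scaling arithmetic, the sketch below estimates how many instances a given number of concurrent users would need. It assumes that the "default setup" is a single instance and that capacity grows linearly with each instance added; both are assumptions for the example, not measured figures, and the function name is hypothetical.

```python
import math

# From the post: the default setup handles about 100 concurrent users.
# Treating that as the capacity of one instance is an assumption.
USERS_PER_INSTANCE = 100


def instances_needed(concurrent_users: int, users_per_instance: int = USERS_PER_INSTANCE) -> int:
    """Estimate the instance count for a load, never dropping below one."""
    if concurrent_users < 0 or users_per_instance <= 0:
        raise ValueError("user counts must be non-negative and capacity positive")
    return max(1, math.ceil(concurrent_users / users_per_instance))
```

Under these assumptions, 250 concurrent users would call for three instances, while anything up to 100 users stays on the default single instance.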
Content Delivery Networks
By using public content delivery networks (CDNs), I utilise the bandwidth of others, which means that I do not have to pay for it. I use a CDN anywhere I can, for the likes of jQuery, Font Awesome and Bootstrap. There is also a bonus for users in the browser: if they have visited another site that uses the same version from the same CDN, it is already cached in their browser and does not need to be loaded again. Within the mobile app, a local copy is served instead to handle offline capability.
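The choice between a CDN copy and a bundled copy can be sketched as a small lookup. The CDN addresses, the `assets` directory and the `asset_url` function below are placeholders invented for the example, not the locations InDEx actually uses.

```python
from typing import Mapping

# Hypothetical CDN locations; placeholders only.
CDN_URLS = {
    "jquery.min.js": "https://cdn.example.com/jquery/jquery.min.js",
    "bootstrap.min.css": "https://cdn.example.com/bootstrap/bootstrap.min.css",
}


def asset_url(
    name: str,
    packaged_app: bool,
    cdn_urls: Mapping[str, str] = CDN_URLS,
    local_dir: str = "assets",
) -> str:
    """Use the bundled copy inside the mobile app, otherwise the CDN copy.

    Assets with no known CDN location also fall back to the local copy.
    """
    if packaged_app or name not in cdn_urls:
        return f"{local_dir}/{name}"
    return cdn_urls[name]
```

For example, `asset_url("jquery.min.js", packaged_app=True)` resolves to the local file, so the app keeps working offline, while the same call with `packaged_app=False` resolves to the CDN.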
Monitoring & Alerts
One of the hardest parts of developing the InDEx infrastructure has been deciding the best way to monitor users' interactions, backend processes and critical code failures.
Google Analytics is used to track user behaviour both in the app and online. I really like how they distill user behaviours and present the data.
Firebase is used to route all traffic, monitor users' requests and abstract away the API end-points.
New Relic is used to monitor and optimise the entire technology stack, from infrastructure and applications to browser and mobile apps. New Relic has been key in tracking performance issues at a transactional and database level.
I use a couple of additional services in the InDEx project. I use Mandrill to manage the email exchanges; they offer a good package and a comprehensive dashboard. Authy is used to manage the two-step authentication of users, and Twilio SMS is used to send and receive text messages.