
Automating Every Aspect of Your Python Project


Every Python project can benefit from automation using a Makefile, optimized Docker images, well-configured CI/CD, code quality tools and more…


By Martin Heinz, DevOps Engineer at IBM


Every project — regardless of whether you are working on a web app, data science or AI — can benefit from well-configured CI/CD, Docker images that are both debuggable in development and optimized for the production environment, or a few extra code quality tools like CodeClimate or SonarCloud. These are all things we will go over in this article, and we will see how they can be added to your Python project!

This is a follow-up to the previous article about creating the “Ultimate” Python project setup, so you might want to check that out before reading this one.

TL;DR: Here is my repository with full source code and docs: https://github.com/MartinHeinz/python-project-blueprint

Debuggable Docker Containers for Development

 
Some people don’t like Docker because containers can be hard to debug or because their images take a long time to build. So, let’s start here by building images that are ideal for development — fast to build and easy to debug.

To make the image easily debuggable we will need a base image that includes all the tools we might ever need when debugging — things like bash, vim, netcat, wget, cat, find, grep, etc. python:3.8.1-buster seems like an ideal candidate for the task. It includes a lot of tools by default, and we can install everything that is missing pretty easily. This base image is pretty thick, but that doesn’t matter here, as it’s going to be used only for development. Also, as you probably noticed, I chose a very specific image – pinning both the Python and the Debian version – and that’s intentional, as we want to minimize the chance of “breakage” caused by a newer, possibly incompatible version of either Python or Debian.

As an alternative you could use an Alpine-based image. That, however, might cause some issues, as it uses musl libc instead of glibc, which Python relies on. So, just keep that in mind if you decide to go this route.

As for the speed of builds, we will leverage multi-stage builds, which allow us to cache as many layers as possible. This way we can avoid re-downloading dependencies and tools like gcc, as well as all the libraries required by our application (from requirements.txt).

To further speed things up, we will create a custom base image from the previously mentioned python:3.8.1-buster that includes all the tools we need, as we cannot cache the steps needed to download and install these tools into the final runner image.

Enough talking, let’s see the Dockerfile:
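
A minimal sketch of such a multi-stage dev.Dockerfile, reconstructed from the description below (the blueprint module name, the custom runner base image and the {NAME}/{VERSION} label placeholders are illustrative assumptions, not verbatim from the repository):

# dev.Dockerfile -- sketch of the multi-stage development image
FROM python:3.8.1-buster AS builder
# install build tools and create the virtual environment
RUN apt-get update && \
    apt-get install --no-install-recommends -y gcc libpython3-dev python3-venv && \
    python3 -m venv /venv && \
    /venv/bin/pip install --upgrade pip

FROM builder AS builder-venv
# install dependencies only when requirements.txt changes (cached otherwise)
COPY requirements.txt /requirements.txt
RUN /venv/bin/pip install -r /requirements.txt

FROM builder-venv AS tester
# copy the source code and run the test suite
COPY . /app
WORKDIR /app
RUN /venv/bin/pytest

# custom Debian-based image with extra debugging tools (assumed name)
FROM martinheinz/python-3.8.1-buster-tools:latest AS runner
COPY --from=tester /venv /venv
COPY --from=tester /app /app

WORKDIR /app
ENTRYPOINT ["/venv/bin/python3", "-m", "blueprint"]
USER 1001

LABEL name={NAME}
LABEL version={VERSION}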

Above you can see that we go through 3 intermediate images before creating the final runner image. The first of them is named builder. It installs all the packages needed to build our final application, including gcc and the Python virtual environment tooling. After installing them, it also creates the actual virtual environment, which is then used by the next images.

Next comes the builder-venv image, which copies the list of our dependencies (requirements.txt) into the image and installs them. This intermediate image is needed for caching, as we only want to install libraries if requirements.txt changes; otherwise we just use the cache.

Before we create our final image, we first want to run tests against our application. That’s what happens in the tester image. We copy our source code into the image and run the tests. If they pass, we move on to the runner.

For the runner image we are using a custom image that includes some extras, like vim or netcat, that are not present in the normal Debian image. You can find this image on Docker Hub here, and you can also check out the very simple base.Dockerfile here. So, what do we do in this final image? First we copy the virtual environment that holds all our installed dependencies from the tester image, then we copy our tested application. Now that we have all the sources in the image, we move to the directory where the application lives and set the ENTRYPOINT so that it runs our application when the image is started. For security reasons we also set USER to 1001, as best practices tell us that you should never run containers under the root user. The final 2 lines set the labels of the image. These are going to get replaced/populated when the build is run using a make target, which we will see a little later.

Optimized Docker Containers for Production

 
When it comes to production-grade images, we want to make sure that they are small, secure and fast. My personal favourite for this task is the Python image from the Distroless project. What is Distroless, though?

Let me put it this way — in an ideal world everybody would build their image using FROM scratch as their base image (that is, an empty image). However, that’s not what most of us would like to do, as it requires you to statically link your binaries, etc. That’s where Distroless comes into play – it’s FROM scratch for everybody.

Alright, now to actually describe what Distroless is. It’s a set of images made by Google that contain the bare minimum needed for your app, meaning that there are no shells, package managers or any other tools that would bloat the image and create noise for security scanners (CVE alerts), making it harder to establish compliance.

Now that we know what we are dealing with, let’s see the production Dockerfile… Well actually, we are not gonna change that much here, it’s just 2 lines:
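
Roughly, only the two FROM lines change (the slim builder base below is an assumption; the essential part is the Distroless runner):

# prod.Dockerfile -- only the base images differ from dev.Dockerfile
FROM debian:buster-slim AS builder
# ... builder, builder-venv and tester stages stay the same ...

FROM gcr.io/distroless/python3-debian10 AS runner
# ... COPY, WORKDIR, ENTRYPOINT, USER and LABEL lines stay the same ...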

All we had to change is our base images for building and running the application! But the difference is pretty big — our development image was 1.03GB and this one is just 103MB; that’s quite a difference! I know, I can already hear you — “But Alpine can be even smaller!” — Yes, that’s right, but size doesn’t matter that much. You will only ever notice image size when downloading/uploading it, which is not that often. When the image is running, size doesn’t matter at all. What is more important than size is security, and in that regard Distroless is surely superior, as Alpine (which is a great alternative) includes lots of extra packages that increase the attack surface.

The last thing worth mentioning when talking about Distroless is debug images. Considering that Distroless doesn’t contain any shell (not even sh), it gets pretty tricky when you need to debug and poke around. For that, there are debug versions of all Distroless images. So, when poop hits the fan, you can build your production image using the debug tag and deploy it alongside your normal image, exec into it and do – for example – a thread dump. You can use the debug version of the python3 image like so:
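
For example (a sketch — the :debug variants ship a busybox shell, so you have to override the entrypoint to get into it):

# start a shell inside the debug variant of the Distroless Python 3 image
docker run --entrypoint=sh -ti gcr.io/distroless/python3-debian10:debug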

Single Command for Everything

 
With all the Dockerfiles ready, let’s automate the hell out of it with a Makefile! The first thing we want to do is build our application with Docker. So, to build the dev image we can run make build-dev, which runs the following target:
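
A sketch of what this target can look like (the MODULE name is an assumed placeholder for your application/package name):

MODULE := blueprint                                     # assumed module/image name
TAG    := $(shell git describe --tags --always --dirty)

build-dev:
	@echo "Building development image $(MODULE):$(TAG)"
	@sed -e 's|{NAME}|$(MODULE)|g' -e 's|{VERSION}|$(TAG)|g' dev.Dockerfile \
		| docker build -t $(MODULE):$(TAG) -f- .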

This target builds the image by first substituting the labels at the bottom of dev.Dockerfile with the image name and a tag created by running git describe, and then running docker build.

Next up — building for production with make build-prod VERSION=1.0.0:
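
A sketch of the corresponding target (VERSION comes from the command line):

build-prod:
	@echo "Building production image $(MODULE):$(VERSION)"
	@sed -e 's|{NAME}|$(MODULE)|g' -e 's|{VERSION}|$(VERSION)|g' prod.Dockerfile \
		| docker build -t $(MODULE):$(VERSION) -f- .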

This one is very similar to the previous target, but instead of using a git tag as the version, we use the version passed as an argument – 1.0.0 in the example above.

When you run everything in Docker, you will at some point need to debug it in Docker too; for that, there is the following target:
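
A sketch of such a target (the target name shell and the CMD variable are assumptions):

# Example: make shell CMD="-c 'date > file; cat file'"
shell: build-dev
	@docker run -ti --rm --entrypoint /bin/bash $(MODULE):$(TAG) $(CMD)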

From the above we can see that the entrypoint gets overridden by bash and the container command gets overridden by an argument. This way we can either just enter the container and poke around, or run a one-off command, like in the example above.

When we are done coding and want to push the image to a Docker registry, we can use make push VERSION=0.0.2. Let’s see what the target does:
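
A sketch of the target (in the real Makefile the image name would also include your registry prefix, which is omitted here):

push: build-prod
	@echo "Pushing production image to registry"
	@docker push $(MODULE):$(VERSION)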

It first runs the build-prod target we looked at previously and then just runs docker push. This assumes that you are logged into the Docker registry, so before running this you will need to run docker login.

The last target is for cleaning up Docker artifacts. It uses the name label that was substituted into the Dockerfiles to filter and find the artifacts that need to be deleted:
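
A sketch, assuming the label is called name:

clean:
	@docker system prune -f --filter "label=name=$(MODULE)"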

You can find full code listing for this Makefile in my repository here: https://github.com/MartinHeinz/python-project-blueprint/blob/master/Makefile

CI/CD with GitHub Actions

 
Now, let’s use all these handy make targets to set up our CI/CD. We will be using GitHub Actions and GitHub Package Registry to build our pipelines (jobs) and to store our images. So, what exactly are those?

  • GitHub Actions are jobs/pipelines that help you automate your development workflows. You can use them to create individual tasks and then combine them into custom workflows, which are then executed — for example — on every push to the repository or when a release is created.
  • GitHub Package Registry is a package hosting service, fully integrated with GitHub. It allows you to store various types of packages, e.g. Ruby gems or npm packages. We will use it to store our Docker images. If you are not familiar with GitHub Package Registry and want more info on it, then you can check out my blog post here.

Now, to use GitHub Actions, we need to create workflows that are going to be executed based on triggers we choose (e.g. a push to the repository). These workflows are YAML files that live in the .github/workflows directory of our repository:
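
For this project the layout looks like this (the two workflow files are created in the next step):

.github
└── workflows
    ├── build-test.yml
    └── push.yml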

In there, we will create 2 files, build-test.yml and push.yml. The first of them, build-test.yml, will contain 2 jobs that are triggered on every push to the repository; let’s look at those:
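
A sketch of what build-test.yml can look like, based on the steps described below (runner OS, Python version and step names are assumptions):

name: Build and Test
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Build dev image
      run: make build-dev

  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - uses: actions/setup-python@v1
      with:
        python-version: '3.8'
    - name: Install dependencies
      run: pip install -r requirements.txt
    - name: Run tests
      run: make test
    - name: Install linters
      run: pip install pylint flake8 bandit
    - name: Run linters
      run: make lint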

The first job, called build, verifies that our application can be built by running our make build-dev target. Before it runs it, though, it first checks out our repository by executing the checkout action published on GitHub.

The second job is a little more complicated. It runs tests against our application as well as 3 linters (code quality checkers). Same as for the previous job, we use the checkout@v1 action to get our source code. After that we run another published action called setup-python@v1, which sets up a Python environment for us (you can find details about it here). Now that we have a Python environment, we also need the application dependencies from requirements.txt, which we install with pip. At this point we can proceed to run the make test target, which triggers our Pytest suite. If our test suite passes, we go on to install the linters mentioned previously – pylint, flake8 and bandit. Finally, we run the make lint target, which triggers each of these linters.

That’s all for the build/test job, but what about the pushing one? Let’s go over that too:
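
A sketch of push.yml under the same caveats (the exact shell commands for extracting the tag and logging into the registry are illustrative):

name: Push
on:
  push:
    tags:
    - '*'

jobs:
  push:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Set RELEASE_VERSION from the pushed git tag
      run: echo ::set-env name=RELEASE_VERSION::$(echo ${GITHUB_REF:10})
    - name: Log into GitHub Package Registry
      run: echo ${{ secrets.REGISTRY_TOKEN }} | docker login docker.pkg.github.com -u ${{ github.actor }} --password-stdin
    - name: Build and push the image
      run: make push VERSION=${RELEASE_VERSION}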

The first 4 lines define when we want this job to be triggered. We specify that this job should start only when tags are pushed to the repository (* specifies the pattern of the tag name – in this case – anything). This is so that we don’t push our Docker image to GitHub Package Registry every time we push to the repository, but rather only when we push a tag that specifies a new version of our application.

Now for the body of this job — it starts by checking out the source code and setting the RELEASE_VERSION environment variable to the git tag we pushed. This is done using the built-in ::set-env feature of GitHub Actions (more info here). Next, it logs into the Docker registry using the REGISTRY_TOKEN secret stored in the repository and the login of the user who initiated the workflow (github.actor). Finally, on the last line it runs the push target, which builds the prod image and pushes it into the registry with the previously pushed git tag as the image tag.

You can check out the complete code listing in the files in my repository here.

Code Quality Checks using CodeClimate

 
Last but not least, we will also add code quality checks using CodeClimate and SonarCloud. These will get triggered together with our test job shown above. So, let’s add a few lines to it:
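
A sketch of the extra step added to the test job (the coverage file name and the CC_TEST_REPORTER_ID secret name are assumptions):

    - name: Send coverage report to CodeClimate
      run: |
        export GIT_BRANCH="${GITHUB_REF/refs\/heads\//}"
        curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
        chmod +x ./cc-test-reporter
        ./cc-test-reporter format-coverage -t coverage.py coverage.xml
        ./cc-test-reporter upload-coverage -r "${{ secrets.CC_TEST_REPORTER_ID }}"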

We start with CodeClimate, for which we first export the GIT_BRANCH variable, retrieved from the GITHUB_REF environment variable. Next, we download the CodeClimate test reporter and make it executable. Then we use it to format the coverage report generated by our test suite, and on the last line we send it to CodeClimate together with the test reporter ID, which we store in repository secrets.

As for SonarCloud, we need to create a sonar-project.properties file in our repository, which looks like this (the values for this file can be found on the SonarCloud dashboard in the bottom right):
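
A sketch with placeholder values (take the real ones from your SonarCloud dashboard; the source directory is an assumption):

sonar.organization=your-github-organization
sonar.projectKey=your_project_key

# relative path to the source directory (assumed module name)
sonar.sources=blueprint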

Other than that, we can just use the existing sonarcloud-github-action, which does all the work for us. All we have to do is supply 2 tokens – the GitHub one, which is available in the repository by default, and the SonarCloud token, which we can get from the SonarCloud website.
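
The corresponding step added to the test job might look like this (the SONAR_TOKEN secret name is an assumption):

    - name: SonarCloud scanner
      uses: sonarsource/sonarcloud-github-action@master
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}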

Note: Steps on how to get and set all the previously mentioned tokens and secrets are in the repository README here.

Conclusion

 
That’s it! With the tools, configs and code above, you are ready to build and automate every aspect of your next Python project! If you need more info about the topics shown/discussed in this article, then go ahead and check out the docs and code in my repository here: https://github.com/MartinHeinz/python-project-blueprint, and if you have any suggestions/issues, please submit an issue in the repository, or just star it if you like this little project of mine. 🙂

 
Bio: Martin Heinz is a DevOps Engineer at IBM. A software developer, Martin is passionate about computer security, privacy and cryptography, focused on cloud and serverless computing, and is always ready to take on a new challenge.

Original. Reposted with permission.


Source: https://www.kdnuggets.com/2020/09/automating-every-aspect-python-project.html


British could lead the world with first sovereign data exchange


The British island of Jersey is ideally placed to become the world’s first privacy-enhancing sovereign data exchange, which will have huge benefits for the whole of the UK and beyond. Global Smart City expert and digital transformation consultant Joe Dignan said the Covid-19 pandemic has highlighted the importance of a combined data view of a population, and now is the time to establish a trusted, highly regulated and ethical exchange. He has called on the Jersey government to support the idea for the benefit of everyone.

A sovereign data exchange is a regulated infrastructure that allows data owners to store, share and monetise their data while retaining ownership and privacy.

‘If dealing with the pandemic has taught us anything, it’s that single sources of data are meaningless unless synthesized with other data and visualised so we can understand it,’ said Joe.

‘Jersey is a microcosm where it controls the levers of the economy, legislature, government and security in an enclosed and agile environment that already has a digital twin. It also has all the necessary skills for regulatory and governance of that data, through its finance industry. This puts it in a unique global position to act as a data exchange which can bring huge health, economic and environmental benefits to the UK and elsewhere.’

Joe and a host of global technology experts, including Fintech titan, Nick Ogden, are discussing Jersey’s position as a sandbox for data innovation and digital testing at a series of free online events for Jersey Tech Week, 16th to 23rd October.

Nick set up what is believed to be the world’s first e-commerce business in the Island in 1994, before launching World Bank. Key figures from IBM, Ocado, World Bank, Carlsberg and Microsoft will also demonstrate the very latest trends and developments for the industry, including insights into emerging tech trends in fintech, artificial intelligence, digital health and creativity.

Joel Mills, the CEO of AugmentCity, will showcase the digital twin of Jersey, part of the United Nations’ United 4 Smart Sustainable Cities initiative, which is already helping to inform decision making.

‘Covid has been terrible for everyone, but it has speeded up the adoption of technology, from the simple need to work from home to the urgent need for good quality data and its use,’ said Joel.

‘The pandemic has shown that single sources of data are meaningless. This has opened up opportunities for us to make big improvements in the future with informed decisions using data from multiple sources and visualised for human understanding.

‘Using simulation in partnership with the UN’s smart sustainable development goals, allows us to connect humans and data like was never possible before. Breaking down barriers, fast tracking new technologies and reducing time and cost. If we are to beat the virus we need to embrace technologies and Jersey is playing a key role in prototyping this.’

Source: https://www.fintechnews.org/british-could-lead-the-world-with-first-sovereing-data-exchange/



EBANX announces expansion to Central America and the launch of EBANX GO within the LatAm region


EBANX, a fintech company specialized in payment solutions for Latin America, announced Push LatAm, an initiative that comprises the expansion of its operations to new markets in Central and South America; the offering of hybrid services within Latin American countries; and the launch of EBANX GO, a prepaid card that offers a digital payments account in partnership with Visa, to other markets in the region besides Brazil – all within the next 12 months. Push LatAm was announced this Thursday, October 15, at the Latin America Summit, EBANX’s annual event about business in Latin America.

EBANX’s expansion to Central America will start in Panama, Costa Rica, the Dominican Republic and Guatemala. Paraguay, in South America, is also a destination. These five new markets will add to the current nine where the company already operates – Brazil, Mexico, Colombia, Argentina, Chile, Peru, Uruguay, Bolivia and Ecuador.

Besides the geographical expansion, Push LatAm also consists of model expansion. After having unveiled its local payment processing, EBANX will now launch its hybrid model, making things more flexible for global companies that have offices in the region, in a fully compliant way. By combining cross-border and local processing services within the same territories, this new model will allow local settlements for merchants, starting with South American countries.

And following a market trend of electronic payments and digital accounts that was accelerated by the pandemic in Latin America, EBANX will also launch EBANX GO in other LatAm markets. The e-wallet was soft-launched in Brazil at the beginning of 2020, with a Visa card and a digital payments account, and has been growing steadily ever since. Around 60% of the current purchases made with the EBANX GO card are within EBANX merchants, proving the product can also work as a performance tool for them, besides being an easy-to-use payment option for consumers in the region.

“The Push LatAm initiative reflects our mission from the beginning of EBANX: to create access, to connect Latin Americans and global brands, always with a customer and product-driven mindset. Expanding our footprint and our solutions right now is the perfect realization of this goal. This will enable us to keep excelling in our commitment with Latin America: to be highly specialized in the region, translating each one of its countries and their singular cultures to businesses around the world,” said João Del Valle, co-founder and COO of EBANX.

Source: https://www.fintechnews.org/ebanx-announces-expansion-to-central-america-and-the-launch-of-ebanx-go-within-the-latam-region/



StructureFlow’s accelerated 2020 growth sees expansion of international operations, customer base and new hires

StructureFlow, a legal tech start-up helping lawyers and finance professionals quickly and easily visualise complex legal structures and transactions, announces two hires to its senior leadership team and continued expansion of its international operations and customer roster. The start-up is also currently participating in Allen & Overy’s ‘Fuse’ Incubator and Founders Factory’s 6-month FinTech accelerator programme as part of its growth strategy.

Founded by former corporate lawyer Tim Follett, StructureFlow is cloud-based software developed to address the difficulties and inefficiencies he faced when trying to visualise complex legal structures and transactions using tools that were not up to the task. The start-up was formally launched earlier this year, at a time when many firms were, and continue to be, heavily focused on finding new technologies that enable efficient collaborative working.

Global growth and expanding beyond the legal industry

StructureFlow opened its first international office outside of the UK in Singapore earlier this year and has been running successful pilots of its visualisation software with prestigious international law firms. In addition to its growing customer base in the UK, the company is expanding internationally and is excited to announce that it will be onboarding customers in India, Australia, the Netherlands, and Canada in the next month.

With the belief that accounting teams, investment banks, private equity firms and venture capital firms will also benefit from access to StructureFlow’s visual structuring tool, the start-up has begun venturing beyond its legal customer base, working with a small number of asset management and private fund businesses. This includes M7 Real Estate, a leading specialist in pan-European, multi-tenanted commercial real estate investment and asset management operations.

“We decided to expand internationally despite the pandemic as there is a heightened need for new technologies to support global organisations who are restructuring business models to adapt to the ‘new normal’,” said Alex Baker, Head of Growth. “Our product helps law firms and other financial institutions to work securely whilst working from anywhere and our growing Singapore operations will allow us to better serve our customers in Asia Pacific.”

 

New hires join the senior leadership team

Jean-Paul de Jong joins StructureFlow as its Chief Technology Officer and Chief Security Officer, along with Owen Oliver as its Head of Product.

With a background in enterprise software development, information security and a track record of many successful large-scale integrations, De Jong has held several prominent positions within regulated industries in both private and public sectors.

Oliver is a co-founder of Workshare Transact, the legal transaction management application that was acquired by Litera in 2019, having previously been a corporate lawyer with Fieldfisher.

Together, they bring decades of legal and technology leadership and expertise to expedite StructureFlow’s product development and will be instrumental in developing the software to meet the demands of the company’s broadening customer base.

Tim Follett, CEO of StructureFlow, commented on these developments, “The decision to further expand our presence across the legal and financial technology markets in Europe, Asia and North America is a logical step in our business growth strategy. The addition of Jean-Paul de Jong and Owen Oliver will bring first-class engineering, security and product expertise to our team, bolstering our ability to build and scale innovative enterprise products.”

Accelerator programmes to complement growth strategy

In a move to broaden its global presence and increase the impact of its product, StructureFlow has joined two reputable accelerator programmes this year. Following the company’s success as the inaugural winner of Slaughter and May’s Collaborate programme, StructureFlow has subsequently joined the fourth cohort of Fuse, Allen & Overy’s flagship legal tech incubator.

More recently, the team has partnered with Founders Factory, joining its FinTech accelerator programme, giving StructureFlow unparalleled access to the programme’s corporate partners. The programme also includes mentorship supporting the company’s growth and the impact of its product across the legal, financial services and other key sectors.

“Being accepted into two industry-acclaimed incubator and accelerator programmes is a crucial part of our expansion plans and will provide us with expert guidance to further develop our visualisation software. By utilising the expertise of industry experts, we expedite our plan of becoming a global platform for a range of organisations and stakeholders to visually engage with essential corporate information,” Follett added.

Source: https://www.fintechnews.org/structureflows-accelerated-2020-growth-sees-expansion-of-international-operations-customer-base-and-new-hires/
