{"id":4660,"date":"2022-03-24T17:19:39","date_gmt":"2022-03-24T16:19:39","guid":{"rendered":"https:\/\/www.it-ps.at\/?p=4660"},"modified":"2024-10-16T11:10:32","modified_gmt":"2024-10-16T09:10:32","slug":"multi-stage-python","status":"publish","type":"post","link":"https:\/\/www.it-ps.at\/en\/multi-stage-python\/","title":{"rendered":"Multi Stage Python"},"content":{"rendered":"\n<p>When shipping applications in containers, one is often confronted with overly large final images. Multi-stage builds are a common way to avoid this, especially for compiled languages like Go or Java. In this post we show how to use multi-stage builds for Python images to bring down image sizes and thereby improve security.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">Using Docker multi-stage builds to reduce image size<\/h2>\n\n\n\n<p>When building container images, it is preferable to keep them small and to include only what is really needed. This has two main advantages:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>images take up less storage space<\/li><li>security vulnerabilities of software that is not installed cannot be exploited<\/li><\/ul>\n\n\n\n<h3 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">Example dockerfile<\/h3>\n\n\n\n<p>Let&#8217;s take a very simple image based on the official Python Docker image, using the slim variant built on bullseye, the stable Debian release at the time of writing. We use Python 3.8, but the exact version is not important. 
The image is pulled from <a href=\"https:\/\/hub.docker.com\/_\/python\" target=\"_blank\" rel=\"noreferrer noopener\">Docker Hub<\/a>.<br><br>The dockerfiles are provided in <a href=\"https:\/\/github.com\/it-power-services\/python-multi-stage-builds\" target=\"_blank\" rel=\"noreferrer noopener\">this repository<\/a> in the docker directory and can be built using &#8216;.\/bin\/build.sh&#8217; on Linux.<\/p>\n\n\n\n<h3 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">Reducing the image size<\/h3>\n\n\n\n<h4 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">The all-in-one solution<\/h4>\n\n\n\n<p>We are simply going to `COPY` over a `requirements.txt` and install the packages it lists using `pip`, as this is a fairly common use case. I have chosen to install <a href=\"https:\/\/pypi.org\/project\/pyodbc\/\" target=\"_blank\" rel=\"noreferrer noopener\">pyodbc<\/a> since it has some system dependencies that have to be installed.<\/p>\n\n\n\n<p><em>Note: The dockerfiles are a <a href=\"https:\/\/stackoverflow.com\/help\/minimal-reproducible-example\" target=\"_blank\" rel=\"noreferrer noopener\">minimal reproducible example<\/a> and do not follow some common best practices!<\/em><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>```dockerfile\nFROM python:3.8-slim-bullseye\n\nRUN apt-get update &amp;&amp; \\\n    apt-get install -y gcc g++ unixodbc-dev\n\nCOPY requirements.txt \/tmp\/requirements.txt\n\nRUN pip3 install --no-cache-dir -r \/tmp\/requirements.txt\n```<\/code><\/pre>\n\n\n\n<p>After building the image, we get an image size of 422MB:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>```bash\n$ docker images original\nREPOSITORY   TAG       IMAGE ID       CREATED              SIZE\noriginal     latest    09ec99a1bafa   About a minute ago   422MB\n```<\/code><\/pre>\n\n\n\n<h4 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">Introducing: 
multi-stage<\/h4>\n\n\n\n<p><a href=\"https:\/\/docs.docker.com\/develop\/develop-images\/multistage-build\/\" target=\"_blank\" rel=\"noreferrer noopener\">Multi-stage builds<\/a> are a neat way to keep build dependencies from blowing up the size of your final image. I am not going to go into detail on how they work.<\/p>\n\n\n\n<p>I am going to create wheels. From the <a href=\"https:\/\/pip.pypa.io\/en\/stable\/cli\/pip_wheel\/\" target=\"_blank\" rel=\"noreferrer noopener\">pip documentation<\/a>: <\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><em>&#8220;Wheel is a built-package format, and offers the advantage of not recompiling your software during every install.&#8221;<\/em><\/p><\/blockquote>\n\n\n\n<p>This is exactly what we want: compile the packages once in a build stage, then install them in the final image without compiling again.<br><br>The original dockerfile is split into two stages: the builder stage installs the build dependencies and creates the wheels. The final stage installs the packages from the wheels without needing any build tools:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>```dockerfile\nARG WHEEL_DIST=\"\/tmp\/wheels\"\n\nFROM python:3.8-slim-bullseye as builder\n\nARG WHEEL_DIST\n\nRUN apt-get update &amp;&amp; \\\n    apt-get install -y gcc g++ unixodbc-dev\n\nCOPY requirements.txt \/tmp\/requirements.txt\n\nRUN python3 -m pip wheel -w \"${WHEEL_DIST}\" -r \/tmp\/requirements.txt\n\n\nFROM python:3.8-slim-bullseye\n\nARG WHEEL_DIST\n\nCOPY --from=builder \"${WHEEL_DIST}\" \"${WHEEL_DIST}\"\n\nWORKDIR \"${WHEEL_DIST}\"\n\nRUN pip3 --no-cache-dir install *.whl\n```<\/code><\/pre>\n\n\n\n<p>The image size has gone down significantly, from 422MB to 128MB:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>```bash\n$ docker images multi-stage\nREPOSITORY    TAG       IMAGE ID       CREATED          SIZE\nmulti-stage   latest    b90d97997b3b   44 seconds ago   128MB\n```<\/code><\/pre>\n\n\n\n<p>The difference is starker when compared with the size of the base image, as the multi-stage image adds only 6MB on top of it:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>```bash\n$ docker images python:3.8-slim-bullseye\nREPOSITORY   TAG                 IMAGE ID       CREATED      SIZE\npython       3.8-slim-bullseye   caf584a25606   5 days ago   122MB\n```<\/code><\/pre>\n\n\n\n<h3 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">Increasing security<\/h3>\n\n\n\n<p>Smaller images can also be more secure.<br>To illustrate this point, we use <a href=\"https:\/\/github.com\/aquasecurity\/trivy\" target=\"_blank\" rel=\"noreferrer noopener\">trivy<\/a> to scan our images for known security vulnerabilities.<br><br>The base image has 85 vulnerabilities:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>LOW: 12<\/li><li>MEDIUM: 35<\/li><li>HIGH: 30<\/li><li>CRITICAL: 8<\/li><\/ul>\n\n\n\n<p>The original image, which includes the build tools, has a total of 331 vulnerabilities:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>UNKNOWN: 2<\/li><li>LOW: 24<\/li><li>MEDIUM: 
175<\/li><li>HIGH: 109<\/li><li>CRITICAL: 21<\/li><\/ul>\n\n\n\n<p>The multi-stage image again has 85 vulnerabilities, all of which it &#8220;inherits&#8221; from the base image; it does not introduce any new ones.<\/p>\n\n\n\n<h3 class=\"has-col-16-abdd-color has-text-color wp-block-heading\">Summary<\/h3>\n\n\n\n<p>When using Python, use wheels and multi-stage builds to decrease the image size and increase the security of your deployments.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When shipping applications in containers, one is often confronted with overly large final images. Multi-stage builds are a common way to avoid this, especially for compiled languages like Go [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":4666,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_EventAllDay":false,"_EventTimezone":"","_EventStartDate":"","_EventEndDate":"","_EventStartDateUTC":"","_EventEndDateUTC":"","_EventShowMap":false,"_EventShowMapLink":false,"_EventURL":"","_EventCost":"","_EventCostDescription":"","_EventCurrencySymbol":"","_EventCurrencyCode":"","_EventCurrencyPosition":"","_EventDateTimeSeparator":"","_EventTimeRangeSeparator":"","_EventOrganizerID":[],"_EventVenueID":[],"_OrganizerEmail":"","_OrganizerPhone":"","_OrganizerWebsite":"","_VenueAddress":"","_VenueCity":"","_VenueCountry":"","_VenueProvince":"","_VenueState":"","_VenueZip":"","_VenuePhone":"","_VenueURL":"","_VenueStateProvince":"","_VenueLat":"","_VenueLng":"","_VenueShowMap":false,"_VenueShowMapLink":false,"footnotes":""},"categories":[28,48],"tags":[],"class_list":["post-4660","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-ai","category-solutions-services"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/posts\/4660","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ww
w.it-ps.at\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/comments?post=4660"}],"version-history":[{"count":1,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/posts\/4660\/revisions"}],"predecessor-version":[{"id":9889,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/posts\/4660\/revisions\/9889"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/media\/4666"}],"wp:attachment":[{"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/media?parent=4660"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/categories?post=4660"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.it-ps.at\/en\/wp-json\/wp\/v2\/tags?post=4660"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}