I’ve been dabbling with free Python classes for most of this past year. I’ll agree with most experts that it’s a very intuitive and creative language to use for data science, but it definitely takes some time to learn. After learning how to read this language with (some) confidence, I was able to truly understand its applications and potential. I’ll be the first to admit how much of a novice I am, so I’ll just say that this article helps me as much as any other novice reading it. Let’s struggle together, shall we?
Most people spend the first few Python lessons defining variables, changing data types, and making lists… maybe even a few loops here and there if you’re picking things up faster than I am. It was honestly a lot of typing, and I even started to second-guess the efficiency of Python. Why invest in this technology if you have to type so much just to make a framework for your data? I’m a GIS professional and I haven’t even gotten to my wheelhouse in Python; how could I possibly make it to my specialty if I can’t handle the basics? Well, the basics are still developing, but now I understand Python’s true power (in my opinion), and that lies in its open-source and expansive libraries.
First, a quick word on the terms functions, modules, and libraries, which I’ll use fairly loosely here. Strictly speaking, a function is a single tool, a module is a file that holds a group of functions, and a library is a collection of modules, but I don’t think my brain’s ready to police that distinction yet. Essentially, instead of creating functions and layouts from scratch in Python, you can count, convert, and analyze to your heart’s content with a simple import of a library. Importing a library puts a whole set of functions at your disposal without you having to type them out or know their inner details. Think of it as a toolbox that you can reach into till the cows come home.
I’d also like to give all the credit for this article to the free and online Geo-Python 2019 course by the Department of Geoscience and Geography at the University of Helsinki. Geoscience should be open-source, and so should the building blocks required to speak this language. Their course gives an excellent, comprehensive approach to the general Python language, and builds on those initial lessons to introduce VERY important libraries such as NumPy and pandas.
In this article we will:
- Review basic concepts of functions and libraries
- Load and use functions
- Go over good notation and practice
- Apply knowledge with a step-by-step example
As I mentioned, a function is a block of reusable code that performs a specific task. Using functions improves organization, readability, and overall efficiency by reducing redundancy and wordiness. Below is the basic outline of the anatomy of a function.
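Here’s a minimal sketch of that anatomy, assuming a simple Celsius-to-Fahrenheit conversion. The function name celsius_to_fahr and the parameter name temp are just my picks for illustration.

```python
def celsius_to_fahr(temp):
    """Convert a temperature from Celsius to Fahrenheit."""
    # The indented lines are the body of the function.
    return 9 / 5 * temp + 32


# Calling the function: the value in the parentheses is handed to temp.
print(celsius_to_fahr(0))
```

The def line, the parameter in parentheses, the colon, the indented body, and the return statement are all the parts a basic function needs.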
The def keyword opens the statement, followed by the name of the function. It’s always good practice to keep this function name short, relevant, and unique. A good alternative to the function name in the above example could be “c_to_f” or “f_conv”. Next, inside the parentheses, is the parameter, which is the input variable for the tool. In this example, whatever value we hand over when we call the function is assigned to the temp variable inside it. Be sure to add a colon after the closing parenthesis or else face the consequences of a big ol’ syntax error. The indented lines that follow make up the body of the function; indentation is how Python knows that a statement belongs to the function above it. Most editors, Jupyter included, indent for you automatically once the colon is in place, which is one more reason not to forget it. This may seem pretty basic, but I’ve spent far too much time staring at the monitor combing for errors when it was just a missing colon.
The return statement defines the value that the function hands back. That value is typically the result of some basic operation or mathematical transformation of the aforementioned parameter. In other words, if there’s a task you want to repeat or store, it’s a good idea to create a function around it and put the operation in the return statement.
It’s good to note that functions aren’t just for mathematical, numerical operations. A function can embed text values into phrases such as “Hi my name is ____”. Functions can also take multiple parameters. Below is an example that’d be very useful for a name form or an electronic application portal. In the example, a name and an age are the parameters, and both are used in the string that the function returns.
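A function like that could be sketched as follows. The function name introduce and the sample values are made up for illustration; the point is only that two parameters feed into one returned string.

```python
def introduce(name, age):
    """Return a short introduction built from a name and an age."""
    # An f-string drops the parameter values straight into the phrase.
    return f"Hi, my name is {name} and I am {age} years old."


print(introduce("Dana", 29))
```

Note that the age can be passed in as a number; the f-string takes care of turning it into text.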
This just scratches the surface, as functions can also be used within loops for iterative purposes, and can even be embedded within other functions. In this next example, I made a conditional statement within a function that classifies temperatures into ranges. The outputs are simply number ranks from 0 to 3, but they might as well mean cold, mild, warm, and hot. You can even include other weather features like rain and other conditions to add to the function’s depth and sophistication.
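A rough sketch of such a classifier is below. The thresholds (0, 15, and 25 degrees Celsius) and the name temp_classifier are my own assumptions for illustration, so swap in whatever ranges make sense for your data.

```python
def temp_classifier(temp_celsius):
    """Rank a Celsius temperature from 0 (cold) to 3 (hot)."""
    # The cut-off values below are illustrative assumptions.
    if temp_celsius < 0:
        return 0  # cold
    elif temp_celsius < 15:
        return 1  # mild
    elif temp_celsius < 25:
        return 2  # warm
    else:
        return 3  # hot


# Using the function inside a loop over several readings.
readings = [-5, 8, 19, 31]
for reading in readings:
    print(reading, "->", temp_classifier(reading))
```

Because each branch returns right away, a temperature only ever lands in one class.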
When it comes to saving and storing these useful functions, we need to create what’s known as a library or module. These are essentially text files that carry the raw code for your tools so that your active project doesn’t have to. This is where you can organize and manage your tools effectively without flooding your active projects with functions. As a first step and rule, we need to create our text file in the same directory as our project file. This is also known as our working directory, the folder that Python looks in when our code asks for a file. The file system is hierarchical, meaning that we need to add the text file to the same folder and folder level as our active project. Again, it’s very simple because it’s just the same folder as the rest of your project, but it’s very easy to mix up and misunderstand. One mistake I commonly make is somehow creating a new folder and a new level within my working directory, leaving me with a module that can’t be read or committed to my repository on GitHub. Below is a diagram to help fortify my explanation.
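If you’re ever unsure where your working directory is, or whether your library file really sits in it, a quick check like this can help. It’s only a sketch; the helper name python_files_in is something I made up.

```python
from pathlib import Path


def python_files_in(directory):
    """Return the sorted names of the .py files sitting directly in a folder."""
    # glob("*.py") only looks at this level, not inside sub-folders.
    return sorted(p.name for p in Path(directory).glob("*.py"))


print("Working directory:", Path.cwd())
print("Libraries found here:", python_files_in(Path.cwd()))
```

If your library’s name doesn’t show up in that list, it probably ended up one folder level too deep.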
After creating the text file, we need to rename it to a unique library name, with a .py suffix that replaces the text file’s .txt suffix. Jupyter Notebook will then start to recognize this file as a Python library instead of a simple text file. After that, it’s just a matter of copying and pasting your functions into the .py library file. To separate multiple tools, leave a blank line or two between them. It’s always good practice to add a docstring to explain what each tool does, as well as what its return value does to your input parameter. You never know, you might forget what each tool does, or you might share them with a friend who needs to get caught up a bit.
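As a sketch, the contents of such a library file might look like this. The file name temp_tools.py and both functions are hypothetical, picked only to show the docstrings and the spacing between tools.

```python
# Contents of a hypothetical library file named temp_tools.py


def celsius_to_fahr(temp_celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    Takes a number in degrees Celsius and returns degrees Fahrenheit.
    """
    return 9 / 5 * temp_celsius + 32


def kelvin_to_celsius(temp_kelvin):
    """Convert a temperature from Kelvin to Celsius.

    Takes a number in Kelvin and returns degrees Celsius.
    """
    return temp_kelvin - 273.15
```

The text in triple quotes right under each def line is the docstring, and it’s what shows up if you run help() on the function later.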
As the diagram above shows, the next thing to do is import your library into your project file. If the library is in your working directory, then a simple import statement brings in all of the functions in the .py library. There are also ways to import individual functions from a library instead of the entire library, but I think importing the whole thing is the easiest way to call on the functions later down the road. Calling your library’s functions in your project file just requires a prefix: either the library’s name or an abbreviation that you pick with the as keyword in the import statement (my example is something I just made up). Consistency is key because you need to use that same prefix for the rest of the project to call upon the library’s functions.
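To make this something you can run as-is, the sketch below first writes a tiny stand-in library to a temporary folder and then imports it. In a real project you’d skip that setup, since your .py file would already be sitting in your working directory. The library name temp_tools and the prefix tt are made up for this demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Setup for the demo only: create a stand-in library file.
module_source = '''
def celsius_to_fahr(temp):
    """Convert a temperature from Celsius to Fahrenheit."""
    return 9 / 5 * temp + 32
'''
folder = Path(tempfile.mkdtemp())
(folder / "temp_tools.py").write_text(module_source)
sys.path.insert(0, str(folder))  # stands in for the working directory
importlib.invalidate_caches()    # make sure Python notices the new file

# Import the whole library under a short prefix of your choosing.
import temp_tools as tt

print(tt.celsius_to_fahr(25))

# Or import a single function; it is then called without a prefix.
from temp_tools import celsius_to_fahr

print(celsius_to_fahr(25))
```

Both calls run the same function; the only difference is whether you reach it through the tt prefix or by its bare name.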
I think that just about covers the extent of my knowledge on this subject. If you made it this far, past the typos, ramblings, and vagueness, then I really appreciate that! This was a good summary and review for me, that really broke down a fundamental conceptual understanding of code, which to me personally, is very lacking in this world, so I hope it helps other people like me as well. Cheers.