Hey there!
I’m a chemical physicist who has been using Python (as well as MATLAB and R) for a lot of different tasks over the last ~10 years, mostly for data analysis but also to automate certain tasks. I am almost completely self-taught, and though I have gotten help and tips from professors over the course of my degrees, I have never really been taught best practices when it comes to coding.
I have some friends who work as developers but have a similar academic background to mine, and through them I have become painfully aware of how bad my code is. When I write code, it simply needs to do the thing, conventions be damned. I do try to read up on the “right” way to do things, but the holes in my knowledge become apparent pretty quickly.
For example, I have never written a class, and I wouldn’t know why I would or where to start (something to do with the init method, right?). I mostly just write functions and scripts that perform the tasks that I need, plus some work in Jupyter notebooks from time to time. I only recently got started with Git and uploading my projects to GitHub, just as a way to teach myself the workflow.
So, I would like to learn to be better. Can anyone recommend good resources for learning programming, but perhaps that are aimed at people who already know a language? It’d be nice to find a guide that assumes you already know more than a beginner. Any help would be appreciated.
To add to this, there are kinda two main use cases for OOP. One is simply organizing your code: when a bunch of operations all act on the same data, you bundle that data into an object and make the operations its methods.
The other use case is when you have two different data types where it makes sense to perform the same operation but with slight differences in behavior.
For example, if you have a “real number” data type and a “complex number” data type, you could write classes for these data types that support basic arithmetic operations defined by a “numeric” superclass, and then write a matrix class that works for either data type automatically.
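A minimal sketch of that idea in Python, in case it helps to see it written out. All of the names here (`Numeric`, `Real`, `Complex`, `Matrix`) are made up for illustration, and in real code you would just use Python's built-in `float` and `complex` plus NumPy. The point is only that `Matrix` never checks which number type it holds; it relies on `+` and `*` being defined.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Numeric(ABC):
    """Hypothetical superclass: anything that supports + and *."""

    @abstractmethod
    def __add__(self, other): ...

    @abstractmethod
    def __mul__(self, other): ...


@dataclass(frozen=True)
class Real(Numeric):
    value: float

    def __add__(self, other):
        return Real(self.value + other.value)

    def __mul__(self, other):
        return Real(self.value * other.value)


@dataclass(frozen=True)
class Complex(Numeric):
    re: float
    im: float

    def __add__(self, other):
        return Complex(self.re + other.re, self.im + other.im)

    def __mul__(self, other):
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        return Complex(
            self.re * other.re - self.im * other.im,
            self.re * other.im + self.im * other.re,
        )


@dataclass
class Matrix:
    """Matrix of Numeric entries; assumes non-empty, rectangular rows."""

    rows: list

    def __matmul__(self, other):
        cols = list(zip(*other.rows))  # columns of the right-hand matrix
        result = []
        for row in self.rows:
            new_row = []
            for col in cols:
                # Start from the first product so no "zero" element is needed.
                total = row[0] * col[0]
                for a, b in zip(row[1:], col[1:]):
                    total = total + a * b
                new_row.append(total)
            result.append(new_row)
        return Matrix(result)


# The same Matrix code handles both entry types.
real_matrix = Matrix([[Real(1), Real(2)], [Real(3), Real(4)]])
real_product = real_matrix @ real_matrix

i = Complex(0, 1)
complex_product = Matrix([[i]]) @ Matrix([[i]])
```

The matrix multiplication is written once and works for either number type, which is the "same operation, slightly different behavior" case.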
Not OP, but I’m also interested in wrapping my head around OOP, and I still struggle with it in a few different respects. If what I’m writing isn’t a full program, but more like a few functions to process data, is there still a use case for writing it in an OOP style? Say I’m doing what you describe, operating on the same data with different functions. If written properly, couldn’t a program do this even without a class structure to it? 🤔
Perhaps skipping classes is inelegant and terrible in the long term, but fine if the code only serves a brief purpose. Is it mainly in long-term use that OOP reveals its greater utility?
Yeah, that’s kinda where the first object-oriented programming came from. In C (which doesn’t have classes) you define a struct (an arrangement of data in memory, kinda like a named tuple in Python), and then you write functions to manipulate those structs.
For example, multiplying two complex vectors might look like:
Programmers decided it would be a lot more readable if you could write code that looked like:
Or even just:
(This last iteration is an example of “operator overloading”).
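To make the three styles concrete, here is a rough sketch in Python rather than C. The names `ComplexVector` and `complex_vector_multiply` are hypothetical, and I'm assuming "multiplying" means an elementwise product. All three calls do exactly the same work; only the spelling at the call site changes.

```python
from dataclasses import dataclass


@dataclass
class ComplexVector:
    re: list  # real parts
    im: list  # imaginary parts

    def multiply(self, other):
        """Style 2: the operation is a method on the object."""
        return complex_vector_multiply(self, other)

    def __mul__(self, other):
        """Style 3: operator overloading, so `a * b` works."""
        return complex_vector_multiply(self, other)


def complex_vector_multiply(a, b):
    """Style 1: a free function acting on plain data (the C-struct way)."""
    # Elementwise (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    re = [ar * br - ai * bi for ar, ai, br, bi in zip(a.re, a.im, b.re, b.im)]
    im = [ar * bi + ai * br for ar, ai, br, bi in zip(a.re, a.im, b.re, b.im)]
    return ComplexVector(re, im)


a = ComplexVector([1, 0], [0, 1])  # the vector (1, i)
b = ComplexVector([0, 0], [1, 1])  # the vector (i, i)

c1 = complex_vector_multiply(a, b)  # style 1: function
c2 = a.multiply(b)                  # style 2: method
c3 = a * b                          # style 3: overloaded operator
```

Note that the method and the operator just forward to the plain function, which is more or less what a class does for you under the hood.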
So yes, you can work entirely without classes, and that’s kinda how classes work under the hood. Fundamentally, object-oriented programming is just an organizational tool to help you write more readable and more concise code.
I use classes to group data together. E.g.
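    import dataclasses

    import matplotlib.pyplot as plt
    import numpy


    @dataclasses.dataclass
    class Measurement:
        temperature: int
        voltage: numpy.ndarray
        current: numpy.ndarray
        another_parameter: bool

        def resistance(self) -> float:
            ...


    # parse_measurements() is not shown here; it returns a list of Measurement objects.
    measurements = parse_measurements()
    # Keep only the measurements where the flag is set.
    measurements = [m for m in measurements if m.another_parameter]

    plt.plot(
        [m.temperature for m in measurements],
        [m.resistance() for m in measurements],
    )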
This is much nicer to handle than three different lists of temperature, voltage and current. And then a fourth list of resistances. And another list for `another_parameter`. Especially if you have more parameters to each measurement and need to group measurements by these parameters.
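As a rough illustration of the grouping point, here is a self-contained sketch. `Reading` and `group_by_temperature` are hypothetical names, and I'm assuming scalar voltage and current with a plain V/I resistance just to keep it short; the real `resistance()` would depend on your data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Reading:
    temperature: int
    voltage: float
    current: float
    another_parameter: bool

    def resistance(self) -> float:
        # Assumption for illustration only: scalar Ohm's law, R = V / I.
        return self.voltage / self.current


def group_by_temperature(readings):
    """Collect readings into a dict keyed by temperature."""
    groups = defaultdict(list)
    for r in readings:
        groups[r.temperature].append(r)
    return dict(groups)


readings = [
    Reading(300, 1.0, 0.5, True),
    Reading(300, 2.0, 0.5, False),
    Reading(77, 1.0, 2.0, True),
]

groups = group_by_temperature(readings)
room_temp_resistances = [r.resistance() for r in groups[300]]
```

Each reading travels through the grouping as one object, so there is no risk of the temperature, voltage and current lists drifting out of sync.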