GIS—where to even begin?
Posted by jklaiho@reddit | ExperiencedDevs | 27 comments
Backend developer (Python) here. I've been at this for over 20 years now, and I've gotta say, GIS stuff is the most impenetrable and intimidating area I've had to deal with. So far I've only had to do spot fix type of stuff to code made by people who knew what they were doing, but I lack any proper general understanding. Stack Overflow has saved my ass a lot of times. I'm very much in the "I don't even know what I don't know" stage.
A task that may be coming my way in the near future (pending some client negotiations) is converting some scripts that use raster GeoTIFFs to use equivalent vector GeoPackage files, as the source organization has changed the way they distribute their materials. I've looked at the scripts briefly, and am dreading the day. There's fuck all for documentation, as one might guess, which doesn't help matters.
It feels like working with anything GIS-related needs PhDs in both computer science and geography. I remember booting up ArcGIS several years ago for some random conversion task. I've no problem learning to use DaVinci Resolve or Autodesk Fusion from scratch to an intermediate level for some random hobby projects, but ArcGIS kicked my ass.
Whoever here who has had to learn GIS dev from scratch, how did you approach it?
PickleLips64151@reddit
Geographer here. I spent 15 years in the GIS space before becoming a software engineer.
While I have a degree in Geography, I'm self-taught when it comes to GIS. I tested out of my GIS requirements for my degree. I had been doing GIS for almost 3 years full-time before I even started working on my Geography degree.
I used ArcGIS back when it was version 3.1 and v8. I was a beta-tester for ArcGIS Pro. I used ArcIMS and later ArcServer.
I read tons of documentation from ESRI's site on what each tool did and how. If you're using ESRI, it's essential because those tools often rely on default assumptions that will not give you the best results. Take creating a hotspot: ESRI's tool takes the longest side of the minimally enclosing polygon that fits your data and divides it by 155 to get your resulting raster size. That's just insane for anything that has more than 20 square miles of area.
I recommend downloading QGIS (FOSS GIS software) and working through their documentation and tutorials. There are tons of tutorial videos.
[How To Lie With Maps](https://www.amazon.com/How-Maps-Third-Mark-Monmonier/dp/022643592X) is an excellent book to help make you a better GIS developer. It should help you avoid some of the pitfalls.
If you need more in-depth GIS skills to progress your career, I recommend taking some GIS courses at the local Community College. I might even take a GIS programming class, not for programming knowledge but for the `what tool I need to solve problem X` knowledge.
GIS analysis and tooling involve using and running many tools in the proper sequence to get the correct results. Choosing the wrong tool, the wrong sequence, or (obviously) the incorrect data will result in wrong answers. I see maps online all the time that are technically correct, but the substance of the map is misleading or inaccurate to the point that the map is useless. Don't get me started on heat maps. About 90% of them are total bunk.
madprgmr@reddit
Heat maps? You mean yet another map of population centers?
PickleLips64151@reddit
They shouldn't be. But most people don't adjust the search radius used to determine a specific point's cluster intensity. So yeah, they turn into population maps.
A hotspot is supposed to be a representation of point data that is closer together than it would be via random distribution. The basis for determining how close things should be is a nearest neighbor analysis. The basis for that calculation is the study area.
ESRI uses a minimally enclosing polygon (MEP) to measure that. That's cool if your data naturally can occur anywhere in that square. But generally, it's a garbage method.
MEP will include areas where your data cannot exist, which causes the analysis to determine the nearest neighbor distance as being much farther apart than it should be. When compared to the actual NND, everything looks clustered. Which turns a simple hotspot analysis into a population map.
The formula for the expected NND is 1 divided by (2 times the square root of the density), where density is n/area. So if the area is too big, the density is too low and the expected NND is too big.
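To make the effect of the study area concrete, here is a minimal sketch of that expected-NND calculation. The point count and both areas are made-up numbers for illustration, and `expected_nnd` is a hypothetical helper, not anything from ESRI's tooling.

```python
import math

def expected_nnd(n_points: int, area: float) -> float:
    """Expected mean nearest-neighbor distance for randomly distributed points.

    Uses 1 / (2 * sqrt(density)), with density = n_points / area.
    The result is in the linear unit of the area (miles for square miles).
    """
    density = n_points / area
    return 1.0 / (2.0 * math.sqrt(density))

# Hypothetical numbers: 400 points that can only occur in 25 sq mi,
# while the minimally enclosing polygon covers 100 sq mi.
n = 400
nnd_realistic = expected_nnd(n, area=25.0)   # study area = where data can exist
nnd_mep = expected_nnd(n, area=100.0)        # study area = enclosing polygon

print(f"expected NND, realistic area: {nnd_realistic:.3f} mi")
print(f"expected NND, enclosing polygon: {nnd_mep:.3f} mi")
```

With the inflated area the expected distance doubles, so the same observed points look far more clustered than they really are.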
I did a demo of this once where I used the previous analyst's results and showed several decreasing search distances, with the last one being mine. Turned it into a short animation. It was like watching amebas reproduce. The hotspots in their previous analysis shrank and split into 2-3 smaller hotspots. Some completely disappeared.
It can be done correctly, but it takes understanding what the tool is supposed to do and applying that in the proper way.
madprgmr@reddit
Interesting. I've not had a chance to do anything production grade with heatmaps - just some simple hierarchical clustering based on zoom level for interactive visualization of trip request counts (to illustrate speculative transit vehicle routing).
I try to keep up with different ways visualizations can be misleading (as I have seen many examples in my life), but I am self-taught and haven't touched ArcGIS itself (just some of ESRI's tooling to consume data from it)... so it's always interesting to learn information about that side of things.
vloewe@reddit
agree! what helped me when starting: tiny experiments in geopandas and lots of QGIS. Afterwards I have actually started building a cloud-based GIS software that is a lot simpler; Atlas.co, might be worth a look. I think the most important thing is finding a real project to learn by doing
PickleLips64151@reddit
There is a book list at [gisittools.com](https://www.gisittools.com/books.php) that you might find helpful. Lots of cookbooks and some college-level intro books. Even within the specialized domain of GIS, there are even more specialized domains: retail, medical, urban, hydrology, etc. It's a rabbit warren to be sure.
QuitTypical3210@reddit
Just use ESRI. But ESRI has issues rooted in its core.
metaphorm@reddit
learn PostGIS
it's an outstanding tool that allows a Postgres database to act as a fully functional GIS engine.
nyanyabeans@reddit
I’ve used this for features and it’s actually incredible. Very well documented, and what’s not in its official documentation is well explained in stackoverflow/exchange and places like Wikipedia. Their utils are also very direct, 1:1 implementations of fairly “universal” GIS concepts, so in learning how it works in PostGIS you at least get an intro to the GIS context around it.
jklaiho@reddit (OP)
Yeah, I’ve used it but at a very abstracted level (via GeoDjango).
metaphorm@reddit
GeoDjango is nice and solves a lot of the common use cases elegantly, but it's deliberately streamlined so not a great way to learn the domain in depth.
PostGIS does just about everything you might want to do and learning it will be a very good way to learn GIS fundamentals too. I'd focus in particular on learning the main geometric datatypes (points, lines, polygons, poly-line-strings, and collections of these) and then also learning some of the particulars of using this stuff in the context of the actual surface of the earth (i.e. it's an oblate spheroid, and projections from planar to spheroidal geometry are important).
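A small sketch of why the "surface of the earth" part matters, using only the standard library. It treats the earth as a sphere (a simplification of the spheroid mentioned above), and the function names are hypothetical. The same one-degree step in longitude covers very different ground distances depending on latitude, which is what naive planar math on lon/lat values misses.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; spherical approximation only

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def naive_degree_distance(lon1, lat1, lon2, lat2):
    """Planar distance computed directly on degrees; unit is 'degrees', not km."""
    return math.hypot(lon2 - lon1, lat2 - lat1)

# One degree of longitude at the equator vs. at 60 degrees north.
at_equator = haversine_km(0.0, 0.0, 1.0, 0.0)
at_60_north = haversine_km(0.0, 60.0, 1.0, 60.0)

print(f"equator: {at_equator:.1f} km, 60N: {at_60_north:.1f} km")
print("naive:", naive_degree_distance(0, 0, 1, 0), naive_degree_distance(0, 60, 1, 60))
```

The naive calculation reports the same distance for both pairs, while the great-circle distance at 60°N is roughly half the equatorial one. This is the kind of gap that projections, and PostGIS's separate geometry and geography types, exist to handle.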
SatisfactionGood1307@reddit
This is the answer
sgtBakerHereAgain@reddit
I recommend looking into pg_featureserv and pg_tileserv from Crunchy Data. Both expose PostGIS data through standardised REST APIs.
OogalaBoogala@reddit
It’s just spending the time to learn the data types, learn coordinate systems, and learning some software. This is really any highly domain specific programming, there’s always more to it.
Anyway, I’d probably get started by messing around in QGIS. It’s free, open source, and has a robust Python API, so you’ll be able to quickly iterate with your current skills. Learning through QGIS’s UI, then implementing it as code has been my go to in the past.
jklaiho@reddit (OP)
I think a big part of the problem is that I’ve always had to dive into the deep end to spot fix other people’s code, with no time to learn the basics. At my age, I don’t hobby code in my free time anymore; I have more interesting things to do. Working in a consultancy as a jack-of-all-trades type of guy, I don’t get much time to just mess around and learn large new domains like GIS.
midasgoldentouch@reddit
Can you spend an hour a day learning? And I mean an hour of your workday. Back at a consultancy I got tapped to make some styling changes in a set of iOS apps for a project. I had never done mobile development before, so I informed my manager and PM that while the design changes were being finalized, I would need to spend an hour or two learning the basics of Swift and iOS development.
If you need to learn the basics to be able to take on various tasks then you should make the case to have a certain number of hours devoted to that.
jklaiho@reddit (OP)
Yeah, possibly; my days tend to be hard to predict. This thread contains a bunch of good pointers to resources that I’ll try to look into.
sebrack10@reddit
adding to this excellent advice, be sure to learn about the most common data types and file extensions for vector + raster layers and some popular Python GIS libraries like geopandas, rasterio and GDAL. you can even use them inside QGIS with its integrated Python terminal
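One thing that may make the GeoPackage format less intimidating for a backend dev: a .gpkg file is a SQLite database, and its layers are registered in a `gpkg_contents` table. The sketch below lists layers using only the standard library. To stay self-contained it builds a tiny in-memory stand-in with made-up layer names instead of opening a real file, and `list_layers` is a hypothetical helper; for actual work, libraries like geopandas or GDAL handle the geometry blobs for you.

```python
import sqlite3

def list_layers(conn: sqlite3.Connection) -> list[tuple[str, str, int]]:
    """Return (table_name, data_type, srs_id) for each layer in a GeoPackage."""
    rows = conn.execute(
        "SELECT table_name, data_type, srs_id FROM gpkg_contents ORDER BY table_name"
    )
    return [(name, data_type, srs_id) for name, data_type, srs_id in rows]

# Stand-in for sqlite3.connect("some_file.gpkg"): only the three columns
# used above are created here; a real gpkg_contents table has more.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents ("
    "table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, srs_id INTEGER)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [("roads", "features", 4326), ("elevation", "tiles", 4326)],  # made-up layers
)

layers = list_layers(conn)
for name, data_type, srs_id in layers:
    print(f"{name}: {data_type} (SRS {srs_id})")
```

A `data_type` of `features` marks a vector layer and `tiles` a raster tile layer, which is a quick way to see what a file you were handed actually contains.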
diegoeripley@reddit
I would start with this [1], Dr. Wu has a lot of useful content in his YouTube.
[1] https://youtube.com/playlist?list=PLAxJ4-o7ZoPfb18kNe2luWX9xKg1233i9&si=R_yB6mYiU0S2oQLN
ButchDeanCA@reddit
You sound like me a few years ago. Also 20yoe here suddenly thrust into the world of GIS, spent many near sleepless nights trying to figure things out in the early months.
What I think is important here is that you focus your efforts on what the current codebase does over going off on your own learning various related tools here and there; it’s a trap doing the latter because ultimately how “GIS” is utilized is application dependent, so you will see something done some way on Stack Overflow then realize there will be issues trying to employ the technique on the codebase you are working on.
I’m still quite new to the field myself, but the only real advice I can give is to take a deep breath, accept that some brick walls are inevitable but you will get over them, others around you using these tools are also still learning and you are not as far behind as you think.
viskis22@reddit
Full stack dev here, joined a GIS heavy project ~3 years ago. I had 0 GIS knowledge before that, now I feel quite comfortable with the tooling. Things that helped me:
* Honestly, QGIS or any similar application is a must. I couldn't have learned the GIS Python tools without inspecting how they modified rasters or vectors. It's one thing to read documentation about a GIS operation, quite another to see its outcome visually. In my case QGIS is open every day, and I'm not even great at it. 90% of the time I use it just to inspect my rasters. Maybe occasionally I will apply very basic GIS operations inside QGIS, but super basic and common stuff that's easy to google.
* AI tooling has gotten a lot better for GIS tasks, so use it. Ask it to explain or generate GIS code. For me it still makes mistakes, so I'm not trusting its outputs fully. But it can really help you to get going.
* I don't know if you will be working on a team of any sort. But if you have colleagues that know GIS, then you got lucky. Don't be shy to ask for their help, because they will definitely speed up the learning process.
ScriptingInJava@reddit
Go through the catalogue of OpenStreetMap tools (Nominatim, OSRM etc) and get them built locally. Hook them up into a backend API and create a small application, you'll quickly learn how to piece the puzzle together.
Next add some functionality like bounding box searches on a map, which translates to geospatial querying within your database (or using PostGIS extensions in your OSRM database). You'll quickly spin off into other segments.
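As a rough illustration of what a bounding box search boils down to, here is a plain-Python sketch. The `Place` type, the function names, and the sample coordinates are all assumptions for the example. A real database would use a spatial index instead of scanning every row, and this version does not handle boxes that cross the antimeridian.

```python
from typing import NamedTuple

class Place(NamedTuple):
    name: str
    lon: float
    lat: float

def in_bbox(place: Place, min_lon: float, min_lat: float,
            max_lon: float, max_lat: float) -> bool:
    """True if the place lies inside (or on the edge of) the bounding box."""
    return min_lon <= place.lon <= max_lon and min_lat <= place.lat <= max_lat

def search_bbox(places, min_lon, min_lat, max_lon, max_lat):
    """Linear scan; a spatial index would replace this in a real backend."""
    return [p for p in places if in_bbox(p, min_lon, min_lat, max_lon, max_lat)]

# Approximate coordinates, for illustration only.
places = [
    Place("Helsinki", 24.94, 60.17),
    Place("Tampere", 23.76, 61.50),
    Place("Stockholm", 18.07, 59.33),
]

# The map viewport the user is currently looking at (lon/lat degrees).
hits = search_bbox(places, min_lon=20.0, min_lat=59.0, max_lon=30.0, max_lat=62.0)
print([p.name for p in hits])
```

In PostGIS the equivalent is typically a query combining the `&&` bounding-box operator with `ST_MakeEnvelope`, which lets the database use a spatial index.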
musty_mage@reddit
It's easy. Start by studying astronomy and take a few advanced courses in geodesy and hey presto! GIS is easy-peasy :)
TruthOf42@reddit
What ecosystem are you working with? ESRI has lots of documentation at least. The first thing I would familiarize yourself with is coordinate systems and translating between them. Everything else is relatively intuitive, but if you don't understand coordinate systems and why they are important, you are just going to be in way over your head constantly.
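To show what "translating between coordinate systems" means in practice, here is a hand-rolled sketch of one common case: longitude/latitude in degrees (WGS84, EPSG:4326) to Web Mercator meters (EPSG:3857), the projection most web map tiles use. The function names are hypothetical, and in real code you would normally let a library such as pyproj do this rather than writing the math yourself.

```python
import math

R = 6378137.0  # WGS84 semi-major axis in meters, used by Web Mercator

def lonlat_to_web_mercator(lon: float, lat: float) -> tuple[float, float]:
    """Convert lon/lat in degrees to Web Mercator x/y in meters."""
    x = R * math.radians(lon)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

def web_mercator_to_lonlat(x: float, y: float) -> tuple[float, float]:
    """Inverse of the above: Web Mercator meters back to lon/lat degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat

# Approximate coordinates of Helsinki, for illustration.
x, y = lonlat_to_web_mercator(24.94, 60.17)
lon, lat = web_mercator_to_lonlat(x, y)
print(f"x={x:.0f} m, y={y:.0f} m -> lon={lon:.2f}, lat={lat:.2f}")
```

The same place ends up with completely different numbers in the two systems, which is why mixing up coordinate systems (or lon/lat order) is such a common source of "my data shows up in the ocean" bugs.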
madprgmr@reddit
Depends on what specific aspect of GIS you are struggling with. Coordinate systems? Tile slicing? Just operating ArcGIS?
jklaiho@reddit (OP)
All of it. "How to do anything with GIS in Python" and what theory I need to master to even understand what the tool/library docs are talking about.
madprgmr@reddit
I don't know where to start for "all of it". Focus on a specific problem you are trying to solve and then work through the gaps in understanding until you see how you can solve it.
There are a whole bunch of GIS libraries, including for Python (ex: geopandas, arcpy (from ESRI), etc.). There are also GIS plugins for Postgres and other databases that simplify storing/querying/manipulating GIS data.