Author(s): Stephen Chege-Tierra Insights

Originally published on Towards AI.

Image created by the author with DALL-E 3

At one point in your life, I am sure, you have interacted with a nice neighbor: the one who greets you on your way to work or school, asks how your day was, helps you carry groceries to your door, or, if you are lucky, brings a baked pie to your home. We also have the opposite of a lovely neighbor: the nosy, inconsiderate one you dread living next to. In other words, neighbors play a major part in our lives.

Now, in the realm of geographic information systems (GIS), professionals often experience a complex interplay of emotions akin to the love-hate relationship one might have with neighbors. Enter K-Nearest Neighbors (k-NN), a technique that personifies the very essence of propinquity and neighborly dynamics. As GIS experts navigate the spatial landscape, they grapple with the intricacies of k-NN, sometimes embracing its insights with open arms, at other times feeling the frustration of its limitations. Let us look at how the K-Nearest Neighbors algorithm can be applied to geospatial analysis.

What is K-Nearest Neighbors?

The K-Nearest Neighbors (k-NN) algorithm is a non-parametric, supervised learning classifier that uses proximity to classify or predict the group of a single data point. It is among the most widely used and straightforward classification and regression methods in machine learning today. Esri defines the K-Nearest Neighbor classifier as a non-parametric classification method that classifies a pixel or segment by a plurality vote of its neighbors, where K is the number of neighbors used in the vote.

k-NN is like asking your neighbors for information: you look at what your closest neighbors are doing to decide what to do next. To categorize a place on a map, for instance, to figure out whether it is a city or a forest, you look at the spots closest to it and identify what they are.
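The plurality vote described above can be sketched in a few lines of plain Python. The coordinates and land-cover labels below are invented purely for illustration; a real GIS workflow would draw them from actual map data:

```python
from collections import Counter
import math

# Toy labeled points: (x, y) map coordinates with a land-cover label.
# All values here are made up for illustration.
points = [
    ((1.0, 1.0), "forest"),
    ((1.5, 2.0), "forest"),
    ((2.0, 1.5), "forest"),
    ((8.0, 8.0), "city"),
    ((8.5, 7.5), "city"),
    ((9.0, 8.5), "city"),
]

def knn_classify(query, data, k=3):
    """Label `query` by a plurality vote of its k nearest labeled points."""
    # Sort the labeled points by Euclidean distance to the query point.
    by_distance = sorted(data, key=lambda p: math.dist(query, p[0]))
    # Count the labels of the k closest points and return the winner.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_classify((1.2, 1.8), points))  # → forest
print(knn_classify((8.2, 8.1), points))  # → city
```

A point that lands near the forest cluster is outvoted into "forest", one near the city cluster into "city"; that is the whole idea behind the plurality vote.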
If the majority of them are woodlands, you could assume the new site is likewise a forest.

Evelyn Fix and Joseph Hodges first formulated k-NN in 1951 while conducting research for the US military, releasing a paper that outlined a non-parametric classification technique known as discriminant analysis. Thomas Cover and Peter Hart developed the approach further in their 1967 paper "Nearest Neighbor Pattern Classification". Today, k-NN is among the most widely used algorithms thanks to its adaptability to fields ranging from genetics and finance to environmental analysis and customer service.

How Can it Be Applied to Geospatial Analysis?

In geospatial analysis, satellite images can be categorized into many groups using k-NN according to attributes like colour, texture, and shape. One way to train a k-NN classifier is on a dataset of satellite photos that have been labeled as cities or forests; based on how closely new photographs resemble the training set, the algorithm can then distinguish between the two.

k-NN can also be applied to geographic clustering, the process of grouping comparable features according to their attribute similarity and physical proximity. This is useful for market segmentation, which divides consumers and companies into clusters according to factors like demography and proximity to one another.

k-NN works especially well with Google Earth Engine, RStudio, and Python for geospatial analysis. Google Earth Engine, for example, can assist in detecting deforestation in Kenya, charcoal burning in Somalia, soil erosion, reforestation, and ocean pollution. By integrating k-NN into these platforms, researchers and environmentalists can use geographical proximity to detect and track environmental changes, helping them make better decisions and implement sustainable management practices.

Benefits of k-NN for GIS
1. Easy to understand: k-NN is friendly to GIS professionals because it works on the concept of proximity, or nearness, making it easy to understand and easy to implement, especially for crucial tasks such as spatial analysis.

2. Scalability: the large spatial datasets frequently encountered in GIS applications can be analyzed efficiently. Thanks to advances in parallel processing techniques and computing resources, k-NN can handle increasingly big and complicated spatial datasets.

3. Accessible on GIS platforms: k-NN is available through open-source libraries for RStudio, Google Earth Engine, and Python, so it can be integrated into a GIS workflow with little effort for effective analysis.

4. No training period: k-NN has no explicit training phase; the data itself is the model and serves as the reference for future predictions. This makes it very time-efficient when you need to build ad hoc models on whatever data is available.
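To make the geographical-proximity idea concrete, here is a hedged sketch of a nearest-neighbour query over geographic coordinates using scikit-learn's NearestNeighbors. The site coordinates are invented for illustration (loosely placed around Nairobi, Kenya); the haversine metric measures great-circle distance on a unit sphere, so latitude/longitude must be converted to radians, and the result is scaled by Earth's radius to get kilometres:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical (latitude, longitude) sites in degrees, invented for
# illustration and loosely placed around Nairobi, Kenya.
sites = np.radians([
    [-1.286, 36.817],   # site 0
    [-1.292, 36.822],   # site 1
    [-1.300, 36.780],   # site 2
    [-1.250, 36.900],   # site 3
])

# A BallTree index with the haversine metric measures great-circle
# distance on a unit sphere; note that fit() only builds the index,
# there is no iterative training step.
nn = NearestNeighbors(n_neighbors=2, metric="haversine", algorithm="ball_tree")
nn.fit(sites)

query = np.radians([[-1.290, 36.820]])  # a hypothetical query location
distances, indices = nn.kneighbors(query)
print("nearest site indices:", indices[0])
print("distances in km:", distances[0] * 6371)  # scale by Earth's radius
```

The same index can answer many queries cheaply once built, which is where the "no training period" benefit above shows up in practice.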
k-NN for Python: Code Sample

```python
# Importing necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Instantiate the k-NN classifier
knn = KNeighborsClassifier(n_neighbors=5)

# Train the k-NN classifier
knn.fit(X_train, y_train)

# Predict the classes for the test set
y_pred = knn.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

This is a simple example of using the scikit-learn library to build a K-Nearest Neighbors classifier in Python. Before executing this script, make sure scikit-learn is installed (pip install scikit-learn).

How to Get Started

1. Decide which platform is best for you: as mentioned earlier, k-NN is available as a library for GIS work in Python, RStudio, and Google Earth Engine. It is up to you to decide which platform suits you best and make the most of it.

2. Make use of documentation and tutorials: the platforms k-NN is available on also offer a vast amount of resources that can help you implement the algorithm with ease. This will be very helpful when it comes to debugging.

3. Understand the concepts: find out what each […]
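One concept worth pinning down early is the choice of k itself. The sketch below reuses the Iris dataset from the sample above and picks k by 5-fold cross-validation with scikit-learn's cross_val_score; the candidate values of k are arbitrary choices for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score a handful of candidate k values with 5-fold cross-validation
# and keep the one with the best mean accuracy.
scores = {}
for k in (1, 3, 5, 7, 9, 11):
    clf = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(clf, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print("mean accuracy per k:", scores)
print("best k:", best_k)
```

Small k values react to local noise while large k values smooth over genuine boundaries, so letting cross-validation arbitrate is a reasonable default.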