# How to apply an Artificial Neural Network to an Amazon Marketplace dataset?

Updated: Nov 14

## EDA - Artificial Neural Network - Amazon UK Products

About Dataset

Overview: This dataset provides a comprehensive overview of various Amazon grocery products. It captures vital statistics such as sales figures, revenue, ratings, and other essential metrics. This data was collected using the Helium10 tool.

Usage: This dataset can be used for various purposes, including:

Market analysis for grocery products on Amazon.

Brand comparison based on sales, reviews, and ratings.

Predictive modeling for sales or revenue.

Analyzing the relationship between price, sales, and reviews.

Understanding the importance of product images in sales.

Acknowledgments: Data was collected using the Helium10 tool.

Let's begin by performing an Exploratory Data Analysis (EDA) on the provided dataset. EDA is a crucial step in understanding your data before moving on to modeling or further analysis.

### Step 1: Load the Data & Preliminary Inspection

First, we'll load the dataset and inspect the first few rows to get a sense of the data structure and the kind of information it contains.

```
import pandas as pd
data = pd.read_csv("../input/amazon-uk-grocery-dataset-unsupervised-learning/dataset.csv")
data.shape
```

`(6341, 20)`

From the initial inspection of the dataset, we can observe the following columns:

Product Details: Description of the product.

ASIN: Amazon Standard Identification Number.

Brand: Brand of the product.

Price: Price of the product.

Sales: Number of units sold.

Revenue: Total revenue generated from the product.

BSR: Best Sellers Rank.

FBA Fees: Fees associated with Amazon's Fulfillment by Amazon service.

Active Sellers: Number of sellers actively selling this product.

Ratings: Average user rating for the product.

Review Count: Number of reviews.

Images: Number of images for the product listing.

Review velocity: Possibly the rate at which reviews are added.

Buy Box: Information on which seller holds the "Buy Box" on the product page.

Category: Product category.

Size Tier: Product size classification by Amazon.

Delivery: Type of delivery (e.g., fulfilled by Amazon).

Dimensions: Product dimensions.

Weight: Product weight.

Creation Date: Date when the product was added.

### Step 2: Data Cleaning & Preprocessing

Let's check for missing values and data types to ensure we can perform further analysis without issues.

```
# Check for missing values
missing_values = data.isnull().sum()
# Check data types
data_types = data.dtypes
missing_values, data_types
```

```
(Product Details 0
ASIN 0
Brand 17
Price 1264
Sales 1802
Revenue 1511
BSR 453
FBA Fees 2636
Active Sellers # 51
Ratings 828
Review Count 828
Images 492
Review velocity 50
Buy Box 1932
Category 0
Size Tier 1529
Delivery 1089
Dimensions 1516
Weight 1437
Creation Date 1
dtype: int64,
Product Details object
ASIN object
Brand object
Price float64
Sales float64
Revenue float64
BSR float64
FBA Fees float64
Active Sellers # float64
Ratings float64
Review Count float64
Images float64
Review velocity float64
Buy Box object
Category object
Size Tier object
Delivery object
Dimensions object
Weight float64
Creation Date object
dtype: object)
```

From the information provided:

Missing Values:

Brand: 17 missing

Price: 1,264 missing

Sales: 1,802 missing

Revenue: 1,511 missing

BSR (Best Sellers Rank): 453 missing

FBA Fees: 2,636 missing

Active Sellers #: 51 missing

Ratings: 828 missing

Review Count: 828 missing

Images: 492 missing

Review velocity: 50 missing

Buy Box: 1,932 missing

Size Tier: 1,529 missing

Delivery: 1,089 missing

Dimensions: 1,516 missing

Weight: 1,437 missing

Creation Date: 1 missing
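Raw counts are easier to compare as a share of the 6,341 rows. The sketch below recomputes those shares from the counts printed above; on the actual DataFrame, `data.isnull().mean() * 100` should give the same figures. The names `missing_counts` and `missing_pct` are only illustrative.

```python
# Missing-value counts copied from the output above (complete columns omitted)
missing_counts = {
    "Brand": 17, "Price": 1264, "Sales": 1802, "Revenue": 1511, "BSR": 453,
    "FBA Fees": 2636, "Active Sellers #": 51, "Ratings": 828, "Review Count": 828,
    "Images": 492, "Review velocity": 50, "Buy Box": 1932, "Size Tier": 1529,
    "Delivery": 1089, "Dimensions": 1516, "Weight": 1437, "Creation Date": 1,
}
n_rows = 6341  # first element of data.shape

# Share of rows with a missing value, in percent
missing_pct = {col: 100 * count / n_rows for col, count in missing_counts.items()}

# Print columns from most to least incomplete
for col, pct in sorted(missing_pct.items(), key=lambda item: item[1], reverse=True):
    print(f"{col:<18}{pct:6.2f}%")
```

FBA Fees is missing in roughly 42% of rows and Sales in roughly 28%, which is worth keeping in mind when these columns are imputed later.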

Data Types: The numeric columns, including "Sales" and "Revenue", are already of type 'float64', so they need no conversion before analysis. The text columns (for example "Brand", "Category", and "Dimensions") are of type 'object'. "Creation Date" is also of type 'object', so we might need to convert it into a datetime format before doing any date-based analysis.
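Converting an 'object' column such as "Creation Date" to a datetime type can be sketched as follows. The sample strings are hypothetical, since the actual date format in the dataset is not shown in this post; `pd.to_datetime` with `errors="coerce"` is one reasonable way to handle entries that fail to parse.

```python
import pandas as pd

# Hypothetical sample values; the real "Creation Date" format is an assumption
sample = pd.DataFrame({"Creation Date": ["2019-03-14", "2021-11-02", None, "unknown"]})

# errors="coerce" turns missing or unparseable entries into NaT instead of raising
sample["Creation Date"] = pd.to_datetime(sample["Creation Date"], errors="coerce")

# Once converted, date parts are available through the .dt accessor
sample["Creation Year"] = sample["Creation Date"].dt.year
print(sample)
```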

Insights & Analysis: Before cleaning the data, let's derive a few insights based on the current information:

Which brands have the most products listed?

Distribution of products across categories.

Average rating distribution.

Let's start with these insights.

```
import matplotlib.pyplot as plt

# Insight 1: Brands with the most products listed
top_brands = data['Brand'].value_counts().head(10)
# Insight 2: Distribution of products across categories
category_distribution = data['Category'].value_counts()
# Insight 3: Average rating distribution
rating_distribution = data['Ratings'].dropna().value_counts().sort_index()

# Plotting the insights
fig, axs = plt.subplots(3, 1, figsize=(14, 18))
# Top brands
top_brands.plot(kind='bar', ax=axs[0], color='skyblue')
axs[0].set_title('Top 10 Brands with Most Products Listed')
axs[0].set_ylabel('Number of Products')
axs[0].set_xlabel('Brand')
# Category distribution
category_distribution.plot(kind='bar', ax=axs[1], color='lightcoral')
axs[1].set_title('Distribution of Products Across Categories')
axs[1].set_ylabel('Number of Products')
axs[1].set_xlabel('Category')
# Rating distribution
rating_distribution.plot(kind='bar', ax=axs[2], color='lightgreen')
axs[2].set_title('Average Rating Distribution')
axs[2].set_ylabel('Number of Products')
axs[2].set_xlabel('Average Rating')
plt.tight_layout()
plt.show()
```

Here are some insights from the provided dataset:

Top 10 Brands with Most Products Listed: The bar chart shows the top 10 brands with the most products listed on Amazon. This provides an overview of the dominant brands in the dataset.

Distribution of Products Across Categories: The products are distributed across various categories. Some categories have a significantly higher number of products listed compared to others. This distribution can give insights into which categories are more populated or popular. The majority of products fall into the 'Grocery' category, followed by 'Food Cupboard' and 'Home & Kitchen'.

Average Rating Distribution: Most products have a rating between 4 and 5, indicating that many products are well-received by customers. There are fewer products with ratings below 4.

```
import matplotlib.pyplot as plt
import seaborn as sns

# List of columns to visualize
columns_to_visualize = ['Price', 'Sales', 'Revenue', 'BSR', 'Ratings']
# Set the style
sns.set_style("whitegrid")
# Plot distributions for the selected columns
fig, axes = plt.subplots(nrows=len(columns_to_visualize), figsize=(10, 15))
for i, col in enumerate(columns_to_visualize):
    sns.histplot(data[col], ax=axes[i], kde=True, bins=50)
    axes[i].set_title(f'Distribution of {col}', fontsize=14)
    axes[i].set_xlabel(col, fontsize=12)
    axes[i].set_ylabel('Frequency', fontsize=12)
plt.tight_layout()
plt.show()

# Examine correlations between key numeric columns
correlation_matrix = data[columns_to_visualize].corr()
# Plotting the correlation matrix
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", linewidths=0.5, fmt=".2f")
plt.title("Correlation Matrix of Key Numeric Columns", fontsize=15)
plt.show()
```

From the correlation matrix of key numeric columns:

Sales and Revenue: There's a strong positive correlation (0.92) between Sales and Revenue, which is expected. As sales increase, revenue typically increases as well.

BSR and Sales, BSR and Revenue: BSR has negative correlations with both Sales (-0.15) and Revenue (-0.16). A lower BSR indicates better sales, which explains the negative correlation. However, the correlation is not very strong, suggesting that BSR alone doesn't entirely determine sales or revenue.

Price and Ratings: There's a very weak negative correlation (-0.06) between Price and Ratings, indicating that the price of a product doesn't significantly influence its rating.

Other correlations between the columns are relatively weak.
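For readers who want to see what `corr()` computes, here is a minimal sketch of Pearson's correlation coefficient on made-up numbers. It also illustrates that pandas skips rows where either value is missing, which matters here because the correlations above were computed before imputation. The `toy` frame and its values are purely illustrative.

```python
import numpy as np
import pandas as pd

# Made-up values with gaps, only to illustrate the mechanics
toy = pd.DataFrame({
    "Price": [5.0, 7.5, np.nan, 12.0, 20.0, 3.5],
    "Sales": [900.0, 640.0, 500.0, np.nan, 150.0, 1100.0],
})

# DataFrame.corr() defaults to Pearson's r and ignores rows with a missing value in the pair
r_pandas = toy.corr().loc["Price", "Sales"]

# The same coefficient computed by hand on the complete rows
complete = toy.dropna()
dx = complete["Price"] - complete["Price"].mean()
dy = complete["Sales"] - complete["Sales"].mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(round(r_pandas, 4), round(r_manual, 4))
```

A coefficient near zero, such as the -0.06 between Price and Ratings, only rules out a strong linear relationship; it does not by itself show that one variable has no influence on the other.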

### Step 3: Basic Data Insights

Before deciding on how to handle missing values and further preprocessing, let's gather some basic insights from the data:

Summary statistics of numerical columns.

Distribution of products across different categories.

Distribution of products based on Ratings.

Let's start with the summary statistics of the numerical columns.

```
# Summary statistics of numerical columns
numerical_summary = data.describe()
numerical_summary
```

Here are some insights from the summary statistics of the numerical columns:

Price:

The average price of products is around USD15.20.

The range of product prices varies from USD0.50 to USD254.99.

FBA Fees:

The average FBA fee is approximately USD5.40.

The fees range from USD0 to USD137.

Active Sellers #:

On average, there are around 3.6 active sellers per product.

Some products have up to 135 active sellers.

Ratings:

The average rating of products is 4.46 (out of 5).

Ratings range from 1 to 5.

Review Count:

Products have an average of 870 reviews.

The number of reviews per product ranges from 1 to 66,998.

Images:

Products have an average of 4.7 images.

The number of images varies from 1 to 25.

Weight:

The average weight of products is approximately 5.2 kg.

Weights range from 0.01 kg to a substantial 881.85 kg.

## Applying Artificial Neural Network to Dataset

An Artificial Neural Network (ANN) can be applied to tabular data such as this dataset. Before we proceed, let's outline the steps:

Data Preprocessing:

Handle missing values.

Convert categorical variables into a format suitable for the neural network (e.g., one-hot encoding).

Normalize numerical features.

Split the data into training and testing sets.

Model Building:

Design an ANN architecture.

Train the model on the training data.

Evaluate the model performance on the testing data.

Model Evaluation:

Use appropriate metrics to evaluate the model's performance.
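The one-hot encoding step mentioned in the outline is not needed for the model built later in this post, which uses only numeric columns, but it can be sketched as follows. The category labels here are hypothetical and may not match the values in the real "Delivery" column.

```python
import pandas as pd

# Hypothetical category labels; the actual values in the dataset may differ
toy = pd.DataFrame({
    "Price": [2.5, 4.0, 3.2, 9.9],
    "Delivery": ["FBA", "FBM", "FBA", "AMZ"],
})

# Each distinct label becomes its own 0/1 column; numeric columns pass through unchanged
encoded = pd.get_dummies(toy, columns=["Delivery"], dtype=float)
print(encoded)
```

Each row has exactly one 1 among the new `Delivery_*` columns, so the network receives the category as numbers without an artificial ordering.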

### Discussion

Before moving forward to the ANN, let's talk about what the optimal features for the model are.

Wang and Wang (2003) stated that *it was found that neural networks with a single hidden layer can do most of the job*. They also stated that *there is no theory on the number of hidden neurons*. According to Kumar et al. (2013), selecting layers depends on two techniques: one is called the optimize algorithm and the other is the hit and trial method. Abhishek et al. (2012) take a more definite approach in their research. They stated that *a typical feed forward with back propagation network should have at least three layers- an input layer, a hidden layer, and an output layer.* So, first and foremost, we should decide whether we are going to use a feed forward or a feedback neural network.

Feed forward neural networks *are artificial neural networks in which nodes do not form loops.* (1)

Feedback neural networks are dynamic. They are also referred to as interactive or recurrent (2).
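To make the "no loops" idea concrete, here is a minimal NumPy sketch of a feed forward pass: data flows from the input through each layer exactly once, and nothing is fed back. The layer sizes, random weights, and the helper names `sigmoid` and `forward` are assumptions chosen for illustration, not part of any library.

```python
import numpy as np

def sigmoid(z):
    """Squash values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Pass x through (weights, bias) pairs: sigmoid on hidden layers, linear output."""
    a = x
    for W, b in layers[:-1]:
        a = sigmoid(a @ W + b)
    W_out, b_out = layers[-1]
    return a @ W_out + b_out

rng = np.random.default_rng(0)
sizes = [9, 9, 9, 1]  # illustrative: 9 inputs, two hidden layers of 9, one output
layers = [(rng.normal(size=(m, n)), np.zeros(n)) for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=(4, 9))  # four made-up samples with 9 features each
y_hat = forward(x, layers)
print(y_hat.shape)
```

A feedback (recurrent) network would differ in that some layer's output is fed back as input at a later step, so the same weights are visited more than once per sample.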

So, what we select will change our whole architecture. Since this work is a ChatGPT experiment, let's select a feed forward neural network, because prompting ChatGPT is worse than writing the code itself.

When discussing feedback neural networks, the source moved on to the types of machine learning. After explaining feed forward and feedback neural networks, it covered some applied learning structures. The first one is supervised learning, which has one dependent variable and one or more independent variables. In this concept, the researchers select a dependent variable to predict and run their mathematical models. For instance, when we want to find what could possibly affect Sales in our dataset, supervised learning methods could be our option. In short, we control what we are going to predict.

The second structure is unsupervised learning. In this process, the researchers don't intervene in the prediction. They add the variables to their models and examine the outcomes of those models. After seeing the results, they select the variables that they are going to use. One can think of this process as preparation for the supervised structures. In other words, as researchers we apply unsupervised learning models to decide which variables are meaningful or significant to the model. After that selection, one can apply supervised learning models to the research. With this approach, one can reduce the human effect on the model, because when applying supervised learning models directly one has to select the variables oneself, and human choices could introduce bias.

The final learning structure is reinforcement learning. In this process, mathematical models look for rewards with little data (Wang, n.d.; Hu et al., 2018; Vimal et al., 2021). Since there are rewards, there are punishments too. To expand on this, there is an agent that tries to reach its ultimate destination, which is the reward. To reach that reward, it has to pass through states while being punished by the model. As one can imagine, rewards and punishments might differ from model to model, data to data, or field to field. Eventually, the agent reaches its destination, but sometimes it can't, and that is a possible outcome too. All in all, reinforcement learning is another method to use. Let's move on.

With the explanation above, it should be clear that we are going to select supervised learning methods while applying the artificial neural network, because, again, we are going to use ChatGPT to create our model.

To sum up, the target variable is Sales. The independent variables are Price, Revenue, Review Count, Ratings, Images, Weight, FBA Fees, Active Sellers #, and Review velocity. A feed forward neural network with two layers will be applied, because Sharma et al. (2017) mentioned that *a thumb rule shows that a minimum 2 layers to be used*.

The number of neurons can differ from research to research. In one thesis, the author used Google Analytics data and chose 256 as the number of neurons (Vařeka, 2020, p. 20). Georgieva et al. (2019) described the number of neurons as the easiest parameter to select once the independent variables are chosen: the neurons are represented by the independent variables. Since we have 9 independent variables, our number of neurons will be 9.
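A quick way to see what this choice implies is to count the trainable parameters. The sketch below assumes two hidden layers of 9 neurons fed by 9 inputs and a single output neuron, which is the configuration built later in this post; `dense_params` is a hypothetical helper, not a library function.

```python
def dense_params(n_in, n_out):
    """One weight per input-output pair plus one bias per output neuron."""
    return n_in * n_out + n_out

sizes = [9, 9, 9, 1]  # inputs, hidden layer 1, hidden layer 2, output
per_layer = [dense_params(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)  # [90, 90, 10] 190
```

With about 190 parameters, this is a very small network, so training is fast but its capacity to fit complex patterns is limited.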

We have not yet mentioned which activation function to use, so let's dive into that. Activation functions transform the input signal into an output signal (Sharma et al., 2017; Kayakuş & Çevik, 2020). Usually, non-linear functions are chosen, and the most widely used one is the sigmoid function (Sharma et al., 2017; Kayakuş & Çevik, 2020). With the help of non-linearity, we can extract complex information from an artificial neural network (Sharma et al., 2017).
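For reference, the standard definition of the sigmoid function and its derivative is given below. The symbol $\sigma$ and the input $x$ are notation introduced here, not taken from the cited sources.

```latex
\sigma(x) = \frac{1}{1 + e^{-x}},
\qquad
\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr),
\qquad
0 < \sigma(x) < 1 .
```

Because the output is confined to the interval between 0 and 1, sigmoid is used here only in the hidden layers; an output layer that must predict unbounded values such as Sales needs to stay linear.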

All in all, the target variable is Sales. The independent variables are Price, Revenue, Review Count, Ratings, Images, Weight, FBA Fees, Active Sellers #, and Review velocity. A feed forward neural network with two layers will be applied, and the sigmoid function will be our activation function.

The dataset is loaded successfully. Let's proceed with the preprocessing steps:

Handle missing values.

Remove duplicates based on the "ASIN" column.

Remove duplicates based on the "BSR" column.

Starting with handling missing values, we'll check for any missing data in the dataset.

### Step 1: Data Preprocessing

```
# Check for missing values in the dataset
missing_values = data.isnull().sum()
missing_values
```

```
Product Details 0
ASIN 0
Brand 17
Price 1264
Sales 1802
Revenue 1511
BSR 453
FBA Fees 2636
Active Sellers # 51
Ratings 828
Review Count 828
Images 492
Review velocity 50
Buy Box 1932
Category 0
Size Tier 1529
Delivery 1089
Dimensions 1516
Weight 1437
Creation Date 1
dtype: int64
```

The dataset contains missing values in several columns. For our analysis, we are primarily interested in the following columns:

Price

Revenue

Review Count

Ratings

Images

Weight

FBA Fees

Active Sellers #

Review velocity

Given the nature of the dataset (Amazon product statistics), we can employ a few strategies:

For numerical columns: Impute missing values with the median value. Using the median will be robust to outliers.

For categorical columns: We won't be using any in this case, but if needed, the mode (most frequent value) can be used for imputation.
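The robustness claim can be checked on a small example. The values below are made up, except that the extreme weight equals the 881.85 kg maximum reported in the summary statistics earlier.

```python
import numpy as np
import pandas as pd

# Made-up weights in kg with one extreme value and one missing entry
weights = pd.Series([0.2, 0.3, 0.5, 0.4, 881.85, np.nan], name="Weight")

mean_value = weights.mean()      # pulled far upward by the single outlier
median_value = weights.median()  # stays at a typical value

# Fill the gap with the median and assign the result to a new Series
filled = weights.fillna(median_value)
print(f"mean={mean_value:.2f}, median={median_value:.2f}")
```

Filling the gap with the mean would assign a weight of more than 170 kg to an ordinary grocery item, while the median assigns 0.4 kg.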

Let's start by imputing the missing values for the relevant columns.

`data.dtypes.sample(20)`

```
Active Sellers # float64
Product Details object
Brand object
FBA Fees float64
Price float64
Delivery object
Buy Box object
Dimensions object
Sales float64
Images float64
BSR float64
Size Tier object
Creation Date object
Review velocity float64
Revenue float64
Category object
Review Count float64
Ratings float64
Weight float64
ASIN object
dtype: object
```

```
# List of columns to impute
columns_to_impute = ['Sales', 'Price', 'Revenue', 'Review Count', 'Ratings', 'Images', 'Weight', 'FBA Fees', 'Active Sellers #', 'Review velocity']
# Impute missing values with the median for the relevant columns
for column in columns_to_impute:
    median_value = data[column].median()
    # Assign the result back; inplace=True on a selected column is deprecated in recent pandas
    data[column] = data[column].fillna(median_value)
# Check if missing values are handled for the relevant columns
missing_values_after = data[columns_to_impute].isnull().sum()
missing_values_after
```

```
Sales 0
Price 0
Revenue 0
Review Count 0
Ratings 0
Images 0
Weight 0
FBA Fees 0
Active Sellers # 0
Review velocity 0
dtype: int64
```

The missing values in the relevant columns have been successfully handled.

Next, we'll remove duplicates based on the "ASIN" column and then based on the "BSR" column. I'll start with the "ASIN" column.

```
# Remove duplicates based on the "ASIN" column
data = data.drop_duplicates(subset="ASIN", keep="first")
# Remove duplicates based on the "BSR" column
data = data.drop_duplicates(subset="BSR", keep="first")
# Display the shape of the dataframe after removing duplicates
data.shape
```

`(4502, 20)`
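One detail of `drop_duplicates` is worth knowing here: pandas treats missing values as equal to each other. Since "BSR" was not among the imputed columns, all rows with a missing BSR except the first are dropped by the second call. The sketch below shows this on made-up rows; the ASINs and ranks are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up listings: a repeated ASIN, a repeated BSR, and two missing BSR values
toy = pd.DataFrame({
    "ASIN": ["A1", "A1", "A2", "A3", "A4", "A5"],
    "BSR": [10.0, 10.0, np.nan, np.nan, 10.0, 25.0],
})

by_asin = toy.drop_duplicates(subset="ASIN", keep="first")
by_bsr = by_asin.drop_duplicates(subset="BSR", keep="first")

print(len(toy), len(by_asin), len(by_bsr))  # 6 5 3
print(by_bsr["ASIN"].tolist())              # ['A1', 'A2', 'A5']
```

If rows with a missing BSR should be kept, one option is to deduplicate only the rows where BSR is present and concatenate the rest back afterwards.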

```
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Define the independent variables and the target variable
X = data[['Price', 'Revenue', 'Review Count', 'Ratings', 'Images', 'Weight', 'FBA Fees', 'Active Sellers #', 'Review velocity']]
y = data['Sales']
# Split the data into training and test sets (90% train, 10% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
# Normalize the independent variables
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
X_train.shape, X_test.shape
```

`((4051, 9), (451, 9))`

The data has been successfully split into training and test sets, with 4,051 samples in the training set and 451 samples in the test set. Now, I'll proceed to build the artificial neural network using the specified configuration:
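The normalization step fits the scaler on the training rows and reuses those statistics for the test rows. A minimal NumPy sketch of the same arithmetic, on made-up data, looks like this; it assumes the usual definition of standardization (subtract the mean, divide by the population standard deviation).

```python
import numpy as np

rng = np.random.default_rng(42)
# Made-up feature matrix: 100 rows, 3 features on very different scales
X = rng.normal(loc=[10.0, 500.0, 4.5], scale=[3.0, 200.0, 0.3], size=(100, 3))
X_train, X_test = X[:90], X[90:]

# Statistics come from the training rows only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # population standard deviation (ddof=0)

X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std  # reuse training statistics, do not refit on test data

print(X_train_scaled.mean(axis=0).round(6), X_train_scaled.std(axis=0).round(6))
```

Fitting on the training set alone keeps information about the test rows out of the preprocessing, so the test loss remains an honest estimate.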

Two layers with a feed-forward approach.

9 neurons in each layer, one for each independent variable.

Sigmoid function as the activation function.

Let's define and train the ANN model.

### Step 2: Model Building

#### Define the Neural Network:

To define a feedforward neural network in Keras, you'll want to use the Sequential model. This allows you to build a model layer by layer.

```
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
# First layer with 9 neurons and input shape for 9 independent variables
model.add(Dense(9, input_dim=9, activation='sigmoid'))
# Second layer with 9 neurons
model.add(Dense(9, activation='sigmoid'))
# Output layer with 1 neuron for regression (no activation function)
model.add(Dense(1))
```

#### Compile the Neural Network:

Before training, you need to compile the model, specifying the optimizer and loss function.

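```
from tensorflow.keras.optimizers import Adam

# Instantiate the optimizer with its default settings
optimizer = Adam()
# Or set the learning rate explicitly; this assignment overrides the one above.
# Note that 0.9 is far larger than Adam's default learning rate of 0.001.
optimizer = Adam(learning_rate=0.9)
model.compile(optimizer=optimizer, loss='mean_squared_error')
```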

#### Train the Neural Network:

Train the model using the training data. You can also specify the number of epochs and the batch size.

`history = model.fit(X_train, y_train, epochs=50, batch_size=10, validation_data=(X_test, y_test))`

```
Epoch 1/50
406/406 [==============================] - 2s 2ms/step - loss: 2858614.2500 - val_loss: 3343397.5000
Epoch 2/50
406/406 [==============================] - 1s 2ms/step - loss: 2641425.7500 - val_loss: 2816830.5000
Epoch 3/50
406/406 [==============================] - 1s 2ms/step - loss: 2363009.0000 - val_loss: 2543710.0000
Epoch 4/50
406/406 [==============================] - 1s 2ms/step - loss: 2213398.2500 - val_loss: 2521893.0000
Epoch 5/50
406/406 [==============================] - 1s 2ms/step - loss: 2093444.0000 - val_loss: 2160948.0000
Epoch 6/50
406/406 [==============================] - 1s 2ms/step - loss: 1889917.2500 - val_loss: 2036024.6250
Epoch 7/50
406/406 [==============================] - 1s 2ms/step - loss: 1786933.5000 - val_loss: 2049822.7500
Epoch 8/50
406/406 [==============================] - 1s 2ms/step - loss: 1687077.2500 - val_loss: 1792626.5000
Epoch 9/50
406/406 [==============================] - 1s 2ms/step - loss: 1675426.3750 - val_loss: 1796724.8750
Epoch 10/50
406/406 [==============================] - 1s 2ms/step - loss: 1542040.7500 - val_loss: 1677810.2500
Epoch 11/50
406/406 [==============================] - 1s 2ms/step - loss: 1459554.6250 - val_loss: 1598192.5000
Epoch 12/50
406/406 [==============================] - 1s 2ms/step - loss: 1384014.2500 - val_loss: 1521225.3750
Epoch 13/50
406/406 [==============================] - 1s 2ms/step - loss: 1334397.2500 - val_loss: 1480886.3750
Epoch 14/50
406/406 [==============================] - 1s 2ms/step - loss: 1288984.3750 - val_loss: 1456181.8750
Epoch 15/50
406/406 [==============================] - 1s 2ms/step - loss: 1295751.0000 - val_loss: 1612808.6250
Epoch 16/50
406/406 [==============================] - 1s 2ms/step - loss: 1197815.7500 - val_loss: 1365041.3750
Epoch 17/50
406/406 [==============================] - 1s 2ms/step - loss: 1153742.8750 - val_loss: 1264334.7500
Epoch 18/50
406/406 [==============================] - 1s 2ms/step - loss: 1094000.0000 - val_loss: 1380682.5000
Epoch 19/50
406/406 [==============================] - 1s 2ms/step - loss: 1481300.8750 - val_loss: 1886902.2500
Epoch 20/50
406/406 [==============================] - 1s 2ms/step - loss: 1594684.3750 - val_loss: 1809671.1250
Epoch 21/50
406/406 [==============================] - 1s 2ms/step - loss: 1529984.3750 - val_loss: 1755668.1250
Epoch 22/50
406/406 [==============================] - 1s 2ms/step - loss: 1515662.7500 - val_loss: 1822382.2500
Epoch 23/50
406/406 [==============================] - 1s 2ms/step - loss: 1540860.3750 - val_loss: 1622749.7500
Epoch 24/50
406/406 [==============================] - 1s 2ms/step - loss: 1429406.5000 - val_loss: 1546356.1250
Epoch 25/50
406/406 [==============================] - 1s 2ms/step - loss: 1399058.7500 - val_loss: 1592563.0000
Epoch 26/50
406/406 [==============================] - 1s 2ms/step - loss: 1351058.1250 - val_loss: 1464581.3750
Epoch 27/50
406/406 [==============================] - 1s 2ms/step - loss: 1325377.1250 - val_loss: 1530590.3750
Epoch 28/50
406/406 [==============================] - 1s 2ms/step - loss: 1292219.0000 - val_loss: 1500969.1250
Epoch 29/50
406/406 [==============================] - 1s 2ms/step - loss: 1261487.7500 - val_loss: 1474737.1250
Epoch 30/50
406/406 [==============================] - 1s 2ms/step - loss: 1235350.5000 - val_loss: 1451010.1250
Epoch 31/50
406/406 [==============================] - 1s 2ms/step - loss: 1209703.8750 - val_loss: 1429759.7500
Epoch 32/50
406/406 [==============================] - 1s 2ms/step - loss: 1186410.1250 - val_loss: 1411729.5000
Epoch 33/50
406/406 [==============================] - 1s 2ms/step - loss: 1165264.8750 - val_loss: 1395212.6250
Epoch 34/50
406/406 [==============================] - 1s 2ms/step - loss: 1145931.5000 - val_loss: 1380653.0000
Epoch 35/50
406/406 [==============================] - 1s 2ms/step - loss: 1128064.8750 - val_loss: 1367660.5000
Epoch 36/50
406/406 [==============================] - 1s 2ms/step - loss: 1111581.5000 - val_loss: 1356548.5000
Epoch 37/50
406/406 [==============================] - 1s 2ms/step - loss: 1096814.3750 - val_loss: 1346641.1250
Epoch 38/50
406/406 [==============================] - 1s 2ms/step - loss: 1083425.7500 - val_loss: 1338239.2500
Epoch 39/50
406/406 [==============================] - 1s 2ms/step - loss: 1071065.7500 - val_loss: 1331373.1250
Epoch 40/50
406/406 [==============================] - 1s 2ms/step - loss: 1059678.3750 - val_loss: 1325409.2500
Epoch 41/50
406/406 [==============================] - 1s 2ms/step - loss: 1049739.1250 - val_loss: 1320891.1250
Epoch 42/50
406/406 [==============================] - 1s 2ms/step - loss: 1040842.7500 - val_loss: 1316939.1250
Epoch 43/50
406/406 [==============================] - 1s 2ms/step - loss: 1032841.4375 - val_loss: 1313940.8750
Epoch 44/50
406/406 [==============================] - 1s 2ms/step - loss: 1025670.6250 - val_loss: 1311390.7500
Epoch 45/50
406/406 [==============================] - 1s 2ms/step - loss: 1019361.0625 - val_loss: 1309691.1250
Epoch 46/50
406/406 [==============================] - 1s 2ms/step - loss: 1013557.6875 - val_loss: 1308005.8750
Epoch 47/50
406/406 [==============================] - 1s 2ms/step - loss: 1008535.5625 - val_loss: 1307549.2500
Epoch 48/50
406/406 [==============================] - 1s 2ms/step - loss: 1003916.6875 - val_loss: 1307255.5000
Epoch 49/50
406/406 [==============================] - 1s 2ms/step - loss: 999992.4375 - val_loss: 1307267.3750
Epoch 50/50
406/406 [==============================] - 1s 2ms/step - loss: 996576.4375 - val_loss: 1307511.7500
```
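The `406/406` counter in the log is the number of batches per epoch. It follows from the training set size and the batch size, as this small check shows. The value 32 is the Keras default batch size used when none is passed, which is the case for `evaluate` and `predict` in this post.

```python
import math

train_rows, test_rows = 4051, 451  # shapes printed after the split
fit_batch_size = 10                # batch_size passed to model.fit
default_batch_size = 32            # Keras default when batch_size is not given

steps_per_epoch = math.ceil(train_rows / fit_batch_size)
eval_steps = math.ceil(test_rows / default_batch_size)
print(steps_per_epoch, eval_steps)  # 406 15
```

The log also shows the validation loss levelling off at around 1.307 million from roughly epoch 47 while the training loss keeps falling, so additional epochs with these settings would be unlikely to help the test performance.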

### Step 3: Evaluate the Neural Network

After training, you can evaluate the performance of your model on the test data.

```
loss = model.evaluate(X_test, y_test)
print(f"Test Loss: {loss}")
```

```
15/15 [==============================] - 0s 2ms/step - loss: 1307511.7500
Test Loss: 1307511.75
```
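Because the loss is the mean squared error, it is expressed in squared units of Sales. Taking the square root gives the root mean squared error (RMSE), which is easier to interpret. The sketch below uses only the test loss reported above.

```python
import math

test_mse = 1307511.75  # test loss reported above (mean squared error)
test_rmse = math.sqrt(test_mse)
print(f"RMSE: {test_rmse:.1f}")
```

An RMSE of about 1,143 means the predictions are typically off by on the order of a thousand units sold, which should be judged against the spread of Sales in the dataset.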

Make Predictions: You can use the trained model to make predictions on new or unseen data.

`predictions = model.predict(X_test)`

`15/15 [==============================] - 0s 2ms/step`

```
import matplotlib.pyplot as plt

# Plotting the training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Training and Validation Loss Over Epochs')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
```

##### Bibliography

Abhishek, K., Singh, M. P., Ghosh, S., & Anand, A. (2012). Weather forecasting model using artificial neural network. Procedia Technology, 4, 311-318.

Georgieva, S., Markova, M., & Pavlov, V. (2019, October). Using neural network for credit card fraud detection. In AIP Conference Proceedings (Vol. 2159, No. 1). AIP Publishing.

Hu, Y., Da, Q., Zeng, A., Yu, Y., & Xu, Y. (2018, July). Reinforcement learning to rank in e-commerce search engine: Formalization, analysis, and application. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 368-377).

Kayakuş, M., & Çevik, K. K. (2020). Estimation the Number of Visitor of E-Commerce Website by Artificial Neural Networks During Covid19 in Turkey. Electronic Turkish Studies, 15(4).

Kumar, R., Aggarwal, R. K., & Sharma, J. D. (2013). Energy analysis of a building using artificial neural network: A review. Energy and Buildings, 65, 352-358.

Wang, M. (n.d.). Reinforcement learning from small data. https://faculty.nps.edu/joroyset/docs/bayopt_2019/mdp-compression20190517baypopt-shared.pdf

Safa, N. S., Ghani, N. A., & Ismail, M. A. (2014). An artificial neural network classification approach for improving accuracy of customer identification in e-commerce. Malaysian Journal of Computer Science, 27(3), 171-185.

Sharma, S., Sharma, S., & Athaiya, A. (2017). Activation functions in neural networks. Towards Data Sci, 6(12), 310-316.

Wang, S. C., & Wang, S. C. (2003). Artificial neural network. Interdisciplinary computing in java programming, 81-100.

Wen, W. (2007). A knowledge-based intelligent electronic commerce system for selling agricultural products. Computers and electronics in agriculture, 57(1), 33-46.

Vařeka, M. (2020). Predicting purchasing intent on ecommerce websites.

Vimal, S., Kayathwal, K., Wadhwa, H., & Dhama, G. (2021). Application of deep reinforcement learning to payment fraud. arXiv preprint arXiv:2112.04236.

###### References

https://www.kaggle.com/datasets/dalmacyali1905/amazon-uk-grocery-dataset-unsupervised-learning

https://www.turing.com/kb/mathematical-formulation-of-feed-forward-neural-network