Current Issue: 2025
Open Access Original Research

Machine Learning Analysis of the Impact of Increasing the Minimum Wage on Income Inequality in Spain from 2001 to 2021

Marcos Lacasa-Cazcarra *

  1. Universidad Internacional de La Rioja, Avenida de la Paz, 137, 26006 Logroño, La Rioja, Spain

Correspondence: Marcos Lacasa-Cazcarra

Academic Editor: Kaya Kuru

Received: July 31, 2024 | Accepted: March 19, 2025 | Published: March 25, 2025

Recent Prog Sci Eng 2025, Volume 1, Issue 1, doi:10.21926/rpse.2501005

Recommended citation: Lacasa-Cazcarra M. Machine Learning Analysis of the Impact of Increasing the Minimum Wage on Income Inequality in Spain from 2001 to 2021. Recent Prog Sci Eng 2025; 1(1): 005; doi:10.21926/rpse.2501005.

© 2025 by the authors. This is an open access article distributed under the conditions of the Creative Commons by Attribution License, which permits unrestricted use, distribution, and reproduction in any medium or format, provided the original work is correctly cited.

Abstract

The main aim of this work is to analyze the impact of the National Minimum Wage (NMW) on wealth redistribution in Spain during the 2001-2021 period. This research addresses a fundamental question: Does increasing the minimum wage effectively reduce income inequality and contribute to a more equitable wealth distribution without negatively impacting overall economic growth and employment? Using a complete statistical census provided by the Spanish Tax Administration Agency, this study examines how the increase in the NMW—which rose from €505.7/month (2001) to €1,108.3/month (2021)—affected income inequality and other macroeconomic indicators. A distinctive feature of this research is the use of a complete national census database, not a sample or projection, which provides more accurate results and more efficient predictive models that reflect the studied population. Through various machine learning models, it is demonstrated that income inequality has been reduced by raising the minimum wage. Contrary to the predictions of previous economic analyses, the increase in the minimum wage has not led to inflation or increased unemployment. On the contrary, it has been consistent with increased net employment, contained prices, and increased corporate profit margins. The main conclusion is that the increase in the minimum wage during the analyzed period has contributed to an effective redistribution of wealth, simultaneously increasing the country’s prosperity, employment, and business profits under the analyzed conditions.

Keywords

Machine learning; minimum wage

1. Introduction

1.1 Minimum Wages and Wealth Redistribution: A Review of Evidence from the United States and Europe

The empirical literature on the effects of minimum wages on income redistribution has evolved considerably in recent decades, with different methodological approaches emerging in the United States and Europe. These different research traditions reflect data availability constraints and fundamental differences in labor market institutions across regions. In the United States, studies mainly rely on the Current Population Survey (CPS), a monthly household survey of about 60,000 households. The CPS provides researchers with consistent longitudinal wage and employment data. However, it remains subject to potential reporting errors and measurement challenges in the lower tail of the wage distribution, where minimum wage effects are concentrated. U.S. researchers have developed increasingly sophisticated econometric techniques to address these limitations, progressing from a state-level variation approach, which attributed approximately 70% of lower-half inequality growth to minimum wage declines [1], to semiparametric counterfactual distributions [2], instrumental variables corrections for measurement error [3], and, most recently, a bunching estimator methodology that tracks the entire distribution of jobs around minimum wage increases [4]. These methodological refinements have moderated earlier conclusions, estimating that minimum wages explain 30-40% of lower-tail inequality growth, a substantial share but more conservative than previous findings [3]. Perhaps most notably, an analysis of 138 state-level minimum wage changes found increases in low-wage workers’ earnings without corresponding employment reductions, challenging conventional theoretical predictions [4]. European research takes a markedly different approach, using primarily the European Structure of Earnings Survey (ESES), a four-yearly survey of enterprises in EU Member States.
Unlike the CPS, the ESES collects wage information directly from employers rather than households, potentially offering greater accuracy but with less frequent observations. European minimum wage systems exhibit considerable institutional heterogeneity, ranging from statutory national minimum wages to sectoral minimum wages set through collective bargaining. Consequently, European studies emphasize comparative institutional analysis rather than quasi-experimental designs, employing variance decomposition techniques and institutional indicators to assess how different minimum wage systems affect earnings inequality [5]. These different methodologies have produced contrasting but complementary findings. While US studies use within-country variation to identify causal effects, European research highlights how institutional design features - including enforcement mechanisms, coverage rates, and interactions with collective bargaining - critically determine the effectiveness of minimum wages. European evidence suggests that statutory minimum wages reduce inequality more effectively than collective bargaining systems when bargaining coverage is incomplete. At the same time, compliance problems significantly undermine potential redistributive effects, particularly in Eastern European countries. Despite methodological differences, both research traditions show that minimum wages compress wage distributions, mainly affecting the lower half of the distribution, with the strongest effects between the 10th and 30th percentiles. The evidence also suggests that moderate increases in the minimum wage can reduce inequality without necessarily causing substantial employment losses. However, the precise mechanisms and magnitudes vary according to the institutional context. This developing understanding of the effects of minimum wages highlights the need for further methodological improvements. 
It suggests that future research should further explore the interplay between minimum wages and broader policy settings, heterogeneous effects across groups of workers, and long-term effects on economic mobility.

1.2 The Multidimensional Effects of Minimum Wage on Labor Markets and Prices

This subsection analyzes six relevant studies that examine minimum wage effects in the United States, Europe, and other contexts. In the American context, a theoretical analysis of firms’ adjustment mechanisms in response to minimum wage increases highlights the importance of non-employment margins [6]. This analysis suggests that firms respond heterogeneously, adjusting prices, productivity, and internal organization, and often avoid direct personnel reductions. A report using data from the Current Population Survey (CPS) to identify those benefiting from rising wages concludes that low- and middle-income workers stand to gain the most [7]. In contrast, [8], using a rotating CPS panel (1979-1997), argues that positive wage effects are partially counteracted by adverse employment effects for low-income workers. Complementarily, a macroeconomic time-series analysis of the impact on inflation finds moderate effects on prices and other key macroeconomic variables [9].

The European perspective is represented by [10], who use German administrative records from the Institute for Employment Research (IAB) to analyze labor reallocation effects after implementing the national minimum wage in Germany. Their difference-in-differences methodology reveals that, rather than net employment reduction, significant reallocation of workers occurred between firms and sectors, with complex distributional consequences.

Outside the Western context, [11] apply cointegration and causality techniques to the Turkish macroeconomic time series (1988-2013), identifying significant effects on unemployment, inflation, and economic growth in a middle-income country with distinctive institutional characteristics.

A relevant methodological aspect concerns the type of databases used. American studies often rely on surveys such as the CPS, using its rotating structure to generate a limited panel. European research, on the other hand, makes use of comprehensive longitudinal administrative records, which allow precise tracking of individual work histories and firm responses. The macroeconomic studies [9,11] instead employ aggregate time series that function as panels at the national level.

1.2.1 Convergences and Divergences in Empirical Evidence on Minimum Wage Effects

The reviewed literature presents both significant convergences and divergences. Among the similarities, recognizing heterogeneous minimum wage effects according to worker characteristics, firms, and economic sectors stands out [7,8,10]. Likewise, multiple investigations coincide on the importance of non-employment adjustments, such as price modifications, productivity, and business reorganization. Macroeconomic studies agree on complex effects on inflation and economic growth. However, specific nuances depend on the national context.

The most notable divergences manifest in conclusions about employment effects. While [8] and [11] identify negative impacts for low-income workers, [10] emphasize reallocation effects rather than net job losses. Similarly, there are discrepancies regarding the distribution of benefits. [7] highlight substantial gains for low and middle-income workers, while [8] warn about negative compensations via employment reduction. Regarding price effects, [9] finds moderate inflationary increases in the United States, while [11] documents more pronounced effects in the Turkish context.

The analyzed evidence suggests that minimum wage effects are highly contextual, depending on the established level, labor market institutions, and general economic conditions. American studies focus on distributive and employment impacts at the microeconomic level, while European and Turkish research emphasizes reallocation processes and macroeconomic consequences. Methodologically, analyses based on longitudinal administrative data, such as the German study [10], offer particularly valuable perspectives by allowing detailed tracking of individual labor trajectories and business responses over time.

In the Spanish context, [12] provide insights using longitudinal administrative microdata from Spanish Social Security records, which constitute a comprehensive panel dataset tracking individual workers over time (MCVL - Portal Estadisticas - Seguridad Social, n.d.). Their analysis of the 2017 minimum wage increase forecasts heterogeneous employment effects across age groups, with the most pronounced negative impacts concentrated among younger workers (under 25) and older workers (over 45). Their difference-in-differences methodology projects that directly affected workers face a 0.6-1.1 percentage point higher probability of job loss than similar unaffected workers. This granular panel data approach enables precise identification of vulnerable demographic segments within the labor market while controlling for individual fixed effects.

1.3 Leveraging Official Census Statistical Data: A Comparative Analysis with the Present Study

1.3.1 Methodological Approaches to Administrative Data

The use of comprehensive administrative datasets is increasingly recognized as a methodological cornerstone in contemporary economic research on minimum wage effects and wealth redistribution. This section examines two notable European studies that use official census data comparable to the Spanish analysis presented in this paper.

[13] based their analysis on Estonian Tax and Customs Board administrative records, which can be characterized as a complete statistical census of the Estonian workforce. These data are derived from monthly employers’ tax declarations combined with individual tax records, providing exhaustive earnings coverage for all formal employment arrangements from 2001 to 2014. The temporal coverage of the Estonian dataset includes several minimum wage adjustments, allowing a detailed examination of changes in the wage distribution over different economic cycles. Notably, these administrative records were processed individually, allowing researchers to control for compositional changes in the labor force, a methodological advantage not available in aggregate datasets.

Similarly, [14] employed comprehensive French administrative tax records to analyze predistribution versus redistribution policies. These official data sources included complete income tax declarations with granular information on diverse income streams (labor, capital, transfers), enabling sophisticated analysis of pre-tax and post-tax inequality measures. A distinctive feature of the French dataset was its complete coverage of the entire income distribution spectrum, including high-income households, which are often under-represented in survey-based studies. The administrative nature of these data allowed precise tracking of income changes in different segments of the population over long periods.

The present Spanish study similarly relies on comprehensive administrative tax records from the Spanish Tax Administration Agency (AEAT), constituting a complete census of all employees in Spain from 2001 to 2021.

1.3.2 Comparative Analytical Techniques

Methodological differences among these studies merit detailed examination. [13] implemented counterfactual distribution analysis to isolate minimum wage effects from other factors influencing wage inequality. Their approach involved decomposing wage distribution changes into three components: minimum wage effects, compositional changes, and residual factors. Statistical significance was established through bootstrap procedures, and robustness was confirmed through alternative specification testing. The Estonian researchers focused primarily on distributional consequences rather than employment effects, differentiating their approach from many other minimum wage studies.

[14] employed distributional national accounts methodologies pioneered by Piketty et al., integrating tax data with national accounts to ensure comprehensive income coverage. Their analytical framework was designed to distinguish between predistributive factors (affecting pre-tax inequality) and redistributive mechanisms (tax and transfer systems). Concentration indices and decomposition techniques were utilized to quantify the relative contributions of different policy approaches to overall inequality reduction. Their methodology explicitly addressed the challenge of comparing countries with other institutional arrangements by developing standardized measures applicable across contexts.

This work introduces methodological innovations by applying machine learning techniques and network analysis not employed in Estonian or French studies. As detailed in the methodology section, graph theory was used to analyze the relationships between macroeconomic variables, and a Random Forest Regressor was implemented to identify non-linear relationships and variable importance in predicting inequality measures. This machine learning approach represents a methodological advancement beyond the traditional econometric techniques employed in comparative studies.

1.3.3 Convergent and Divergent Conclusions

The findings from these studies utilizing official administrative data reveal important similarities and notable differences. [13] concluded that minimum wage increases significantly compressed the wage distribution in Estonia, with benefits primarily accruing to workers in the lower three deciles. Their evidence revealed positive spillover effects extending to approximately the 20th percentile of the wage distribution. These wage compression effects were achieved without detectable negative employment consequences. However, the authors acknowledged potential limitations in their ability to identify employment effects given their methodological focus on wage distribution.

The researchers concluded that minimum wage increases have been found to reduce wage inequality in Estonia by raising the wage floor and generating positive spillover effects that extend to the 20th percentile of the wage distribution. These effects were observed without any evidence of a reduction in employment.

[14], while adopting a broader focus than exclusively examining minimum wage effects, provided crucial insights regarding how labor market institutions influence pre-tax inequality. Their comparative analysis demonstrated that France’s interventionist approach to predistribution, including relatively high minimum wages and strong collective bargaining institutions, had proven more effective at constraining pre-tax inequality than the market-oriented approach observed in the United States. The researchers emphasized the limitations of purely redistributive policies if not accompanied by pre-distributive measures to address market income inequalities.

Their main finding was that the contrast between France and the United States suggests that predistributive policies may be more effective than redistribution in achieving lower levels of income inequality. The divergence in pre-tax income inequality between these countries is primarily due to differences in labor market institutions rather than differences in redistributive mechanisms.

This work is consistent with the above findings, extending them in several dimensions. The conclusion that an increase in the minimum wage over the period analyzed has led to an increase in the wealth of the nation, fostered a rise in employment and corporate profits, and, given the conditions investigated, is proposed as an effective means of redistributing wealth, confirms the positive assessment of minimum wage effects presented in both comparative studies.

In Spain, the setting of the national minimum wage (NMW) is essentially a political decision. Progressive governments, especially in recent years, have implemented substantial increases. This paper examines the impact of NMW policy from 2001 to 2021, during which the NMW increased from €505.7/month (2001) to €1,108.3/month (2021). The distinguishing feature of this study is its extraordinary database: comprehensive administrative tax records provided by the Spanish Tax Agency (Agencia Tributaria). Unlike previous studies based on surveys or representative samples, this database represents a complete census of all employees in Spain. This methodological advantage eliminates the problems of sampling error and statistical inference inherent in sample-based studies, thus significantly increasing the precision and reliability of our conclusions. The analyses and results presented here therefore reflect actual labor market dynamics derived from the exhaustive Spanish wage census for the period 2001-2021, rather than projections or estimates.

2. Methods

2.1 The Database

The Spanish Tax Administration Agency (Spanish: Agencia Estatal de Administración Tributaria, AEAT) is a public institution. It is attached to the Ministry of Economy and Finance through the former State Secretariat for Finance and Budget. It handles the application of the tax system under the constitutional principle that everyone must contribute to supporting public expenditure according to their economic capacity. It prepares the annual tax collection reports, which provide information on the amount and yearly evolution of the tax revenues managed by the AEAT. It offers a database in Excel format called "Distribucion salarios" in the supplementary material. This database collects information on employees in Spain who receive income from a company or entity required to provide a list of those receiving such income. This database does not include information on households that pay wages to employees in the household, nor does it reflect the reported wages of the self-employed.

Payments are defined as income, in cash or kind, paid by the reporting unit (company or institution) in the form of annual revenue. Employees are included in the database even if they worked for only one day. If an employee has worked for more than one company or entity, the amount shown is the sum of all payments made to that employee by the different companies that paid their salary. The information is completely anonymous and is presented in intervals of €200, from €0 to €80,000. The last interval is open, with no maximum value specified. A total of 400 annual salary levels are presented. Each cell represents the value corresponding to each bracket by year (2001-2021). Three variables are analyzed: number of employees, salary, and withholding tax.

2.2 Additional Macroeconomic Data

This paper analyzes other macroeconomic data regularly published by the National Statistics Institute (INE), such as the NMW, Consumer Price Index, Unemployment Index, Gross Domestic Product, and Public Debt. In no case are the data deflated, as they are all nominal.

2.3 Calculation of the Gini Index

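The formula used to calculate the Gini index is:

\[ G=\frac{2}{n^2\bar{x}}\sum_{i=1}^ni\left(x_i-\bar{x}\right) \]

where $x_i$ is the income of the $i$-th employee with incomes sorted in ascending order, $n$ is the number of employees, and $\bar{x}$ is the mean income.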

To calculate the Gini index for a year, the incomes of the employees registered in that year are needed. The database provides this information only in brackets: for each interval (a range of €200), it gives the number of employees whose gross annual income falls in the interval, their total gross income, and the total income tax withheld. Each employee in the $i$-th interval is therefore assigned the interval mean $\bar{x}_i$, which amounts to assuming zero variance within the interval. This results in a vector for each interval composed of $j$ identical values equal to the mean of that interval, where $j$ is the number of employees in the interval.

\[ \mathrm{Vector}_i=(\underbrace{\bar{x}_i,\ldots,\bar{x}_i}_{j\ \mathrm{employees}}) \]

The vector for each year is the concatenation of all interval vectors for that year.
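
The bracket-to-vector construction and the Gini formula above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `gini_from_brackets` and the toy brackets are assumptions made here, not the code or data used in the study.

```python
def gini_from_brackets(brackets):
    """Gini index from bracketed data.

    brackets: list of (n_employees, total_gross_income) tuples, one per
    interval (hypothetical input layout, assumed for this sketch).
    """
    # Expand each interval into its mean, repeated once per employee.
    values = []
    for n_employees, total_income in brackets:
        if n_employees > 0:
            interval_mean = total_income / n_employees
            values.extend([interval_mean] * n_employees)

    values.sort()  # the formula assumes incomes in ascending order
    n = len(values)
    overall_mean = sum(values) / n
    weighted = sum(i * (x - overall_mean)
                   for i, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * n * overall_mean)


# Toy example: two employees averaging 100 and two averaging 300.
toy_gini = gini_from_brackets([(2, 200.0), (2, 600.0)])
print(toy_gini)
```

Expanding every bracket into an explicit vector mirrors the description above but is memory-hungry for millions of employees; a production version would likely work with bracket counts and cumulative sums instead.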

2.4 Graph Analysis

Graph theory was used to analyze the relationships between macroeconomic variables. A graph is a collection of nodes (also called vertices) connected by edges (undirected) [15]. The pattern of interactions between the nodes (individuals or entities) can be captured through the graph structure. The purpose of graph (or network) analysis is the study of relationships between individuals to discover knowledge about global and local structures.

In this paper, the graph nodes are defined as all macroeconomic variables, and the edges are defined as moderate or strong correlations between them. The correlation between two nodes is represented by $corr(i,j)$, and the Spearman correlation is considered moderate or strong if $corr(i,j)\geq0.5$ [16] in the case of direct correlation. An $edge(i,j)$ is defined if $abs\left(corr(i,j)\right)\geq0.5$.
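
As a minimal sketch of the edge rule just described, the following Python code computes Spearman correlations (Pearson correlation of average ranks) and keeps the pairs with an absolute value of at least 0.5. The helper names and the toy series `A` to `D` are made up for illustration; they are not the study's variables or data.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in order[pos:end + 1]:
            result[k] = avg
        pos = end + 1
    return result


def pearson(a, b):
    """Pearson correlation; assumes neither series is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def build_edges(series, threshold=0.5):
    """Return {(i, j): rho} for every pair with |rho| >= threshold."""
    names = sorted(series)
    edges = {}
    for idx, i in enumerate(names):
        for j in names[idx + 1:]:
            rho = spearman(series[i], series[j])
            if abs(rho) >= threshold:
                edges[(i, j)] = rho
    return edges


# Toy yearly series (invented values, for illustration only).
toy = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 5, 9, 10],
    "C": [5, 4, 3, 2, 1],
    "D": [3, 1, 4, 2, 3],
}
edges = build_edges(toy)
print(edges)
```

In this toy case `D` is only weakly correlated with the other series, so it ends up as an isolated node, while `A`, `B`, and `C` are fully connected.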

Detecting communities in networks is one of the most popular topics in modern network science. Communities, or clusters, are typically groups of nodes that are more likely to be connected to each other than to members of other groups, although different patterns are possible. There are no universal protocols, either for defining a community itself or for other crucial issues such as validating algorithms and comparing their performance.

The Louvain method hierarchically performs a greedy optimization of modularity [17], assigning each vertex to the neighboring community that yields the highest modularity gain and then creating a smaller weighted network whose vertices are the clusters found previously [18]. Partitions found on this super-network hence consist of clusters that include the ones found earlier and represent a higher hierarchical level of clustering. Software used: Gephi v 0.10 [19].
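
To make the optimized quantity concrete, the sketch below computes the modularity of a given partition of an unweighted, undirected graph, using the standard definition $Q=\sum_c\left[L_c/m-(d_c/2m)^2\right]$, where $L_c$ is the number of edges inside community $c$, $d_c$ is the sum of the degrees of its nodes, and $m$ is the total number of edges. It is not the Louvain algorithm itself, and it ignores edge weights and resolution parameters that a tool such as Gephi may apply; the function name and the toy graph are assumptions for illustration.

```python
def modularity(edges, community):
    """Modularity of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Toy graph: two triangles joined by a single edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
print(q_two, q_one)
```

Splitting the toy graph into its two triangles gives a clearly positive modularity, whereas placing every node in one community gives zero, which is why a greedy modularity optimizer would prefer the two-community partition here.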

2.5 Multivariate Linear Regression

The statistical model is assumed to be

\[ Y\,=\,X\beta\,+\,\mu \]

where $\mu\sim N(0,\Sigma)$. The model is estimated by ordinary least squares (OLS), which assumes independent and identically distributed errors and minimizes the mean squared error (MSE). The R-squared formula is

\[ R^{2}\,=\,1\,-\,\frac{\mathrm{sum~of~squared~residuals~(SSR)}}{\mathrm{total~sum~of~squares~(SST)}}\,=\,1\,-\,\frac{\sum(y_{i}-\hat{y}_{i})^{2}}{\sum(y_{i}-\bar{y})^{2}} \]

The Durbin-Watson (DW) statistic tests for autocorrelation in the residuals from a statistical model or regression analysis. A value of 2 indicates no autocorrelation; values from 0 to less than 2 indicate positive autocorrelation, and values above 2 up to 4 indicate negative autocorrelation.

The Durbin-Watson test statistic $d$ is given by

\[ d\,=\,\frac{\sum_{i=2}^N(e_i-e_{i-1})^2}{\sum_{i=1}^N{e_i}^2} \]

where $N$ is the number of observations and $e_{i}$ is the residual for each observation $\left(i\right)$. The software used is the statsmodels package for Python [20].
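
The two diagnostics above can be written out directly from their formulas. The following is a minimal pure-Python sketch with invented numbers; the function names are assumptions for this illustration and the study itself reports using statsmodels.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSR / SST for observed y and fitted values y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot


def durbin_watson(residuals):
    """d = sum_{i=2..N} (e_i - e_{i-1})^2 / sum_{i=1..N} e_i^2."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Toy data (invented): fitted values that alternately over- and undershoot.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
residuals = [obs - fit for obs, fit in zip(y, y_hat)]

r2 = r_squared(y, y_hat)
dw = durbin_watson(residuals)
print(r2, dw)
```

Because the toy residuals flip sign at every step, the Durbin-Watson statistic lands well above 2, which is the pattern associated with negative autocorrelation.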

2.6 Random Forest Regressor

The Random Forest Regressor (RFR) is an ensemble learning model. It combines the predictions of multiple models to produce more accurate results than a single model [21]. A decision tree (DT) is a simple model that predicts the outcome by performing a partition based on the predictor (input variable) that provides the most significant reduction in mean squared error (MSE).

\[ MSE\,=\,\frac{1}{n}\sum_{i=1}^n\,(y_i\,-\,\hat{y}_i)^2 \]

where $y_i$ and $\hat{y}_i$ are the measured and predicted values of the samples in a node, respectively, and $n$ is the number of samples in a node. A node that cannot branch further because the MSE no longer decreases is called a leaf node, and the average of the samples in that node becomes a candidate for prediction. When unseen data is entered into the final DT model, the data moves according to predetermined branching criteria. The value of the leaf node where the data finally arrives is used as the predicted value of the DT. Scikit-learn is the machine learning software used in Python [22].
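
The partition rule of a single regression tree node can be illustrated with a short sketch. It searches one predictor for the threshold that minimizes the size-weighted MSE of the two child nodes, with each node predicting the mean of its samples. This is a simplified illustration with invented data, not scikit-learn's implementation and not a full Random Forest (which would add bootstrap sampling, random feature subsets, and averaging over many trees).

```python
def mse(values):
    """MSE of a node that predicts the mean of its samples."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(x, y):
    """Return (threshold, weighted_child_mse) for the best split on x.

    Returns None if all x values are identical (no split possible).
    """
    pairs = sorted(zip(x, y))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between equal predictor values
        left = [p[1] for p in pairs[:k]]
        right = [p[1] for p in pairs[k:]]
        score = (len(left) * mse(left) + len(right) * mse(right)) / len(pairs)
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2.0
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Toy data (invented): the target jumps once the predictor exceeds 3.
x = [1, 2, 3, 4, 5, 6]
y = [10, 10, 10, 20, 20, 20]

parent_mse = mse(y)
split = best_split(x, y)
print(parent_mse, split)
```

In this toy case the parent node has a positive MSE, and the split halfway between 3 and 4 separates the two groups perfectly, so both children become leaf nodes with zero MSE.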

3. Results

3.1 Calculation of the Gini Index

The Gini index of the gross wages received by all workers in the years under study is analyzed and called the Gross-Gini. A similar analysis, called the net Gini, is carried out on net income (gross salary minus withholding tax). The difference between the two indexes shows the effect of income tax progressivity. Table 1 shows the annual results of the database “Distribucion salarios” provided by AEAT. The nominal increase in average annual gross earnings was similar to the annual increase in the minimum wage: over the twenty years of the study, they increased by €7,500 and €7,300, respectively. The percentile of the annual minimum wage is calculated with respect to gross annual income. In most years, more than 30% of workers did not earn the equivalent of the annual minimum wage. This is due to a significant level of underemployment. Tourism in Spain is seasonal and accounts for more than 13% of the total employed [23]. Seasonal work in the agricultural sector accounts for 5% of employment in Spain [24]. Over the period under review, the progressivity of income tax has remained stable. This progressivity reduces income inequality, and the difference between the calculated Gini indices indicates the size of that reduction.

Table 1 Macroeconomics time series.

In the following dataset analysis, we reduce the number of intervals from 400 to 5 by widening them from €200 to €20,000. The average withholding tax per interval and its evolution over the study period are shown in Figure 1. The average withholding rate shows a general downward trend that is more pronounced in the lower average wage brackets. Comparing 2001 with 2021, the decline is similar across the five intervals, at around three percentage points, but the impact on net income is much more significant in the range below 20K (from 6.09% in 2001 to 3.64% in 2021) than in the range between 60K and 80K (29.31% in 2001 to 25.85% in 2021).


Figure 1 Change in average withholding tax by income ranges. Five intervals of €20,000 from the database are defined. The lines show the change in the average withholding rate over time for each.

3.2 Mean Salary Analysis

The average gross annual salary is calculated for the entire population (Mean Gross Salary). In addition, the average gross annual salary is calculated for the range [€10,000-€80,000] (Mean Gross Salary Range), i.e., for each year, wages and employees with gross annual salaries below €10,000 and above €80,000 are not considered. Both lines are highly correlated with each other, as well as with GDP in nominal euros (Figure 2). Two periods of economic growth that significantly increased the average wage can be identified: 2001-2008 and 2015-2021.


Figure 2 Evolution of Average Gross Wage vs. GDP. The blue line represents the average annual gross salary calculated using only the gross salary range [€10,000-€80,000]. The orange dashed line represents the average salary of the total number of employees in the database. The values of the bars represent the value of the GDP (€ trillion). The green dotted line represents the number of employees by year (tens of millions).

3.3 Unemployment Analysis

Figure 3 provides a visual analysis of the intuitive relationship between continuous increases in the minimum wage, especially in recent years, and unemployment. It is observed that youth unemployment has the highest elasticity for periods of economic crisis. No increase in unemployment among workers over 55 years of age is observed to be associated with an increase in the minimum wage. The National Statistics Institute (INE) periodically publishes unemployment figures. Quarterly figures are used in this case.


Figure 3 Unemployment trends by age and minimum wage 2002-2023. The lines represent the quarterly value of unemployment according to the age of the worker, as collected by the National Statistics Institute (INE). The bars represent the minimum wage as defined by the government.

3.4 Graph Analysis

The following variables are evaluated.

  • Diff. - The difference between the average gross salary of all employees and the average gross salary in the range [10,000-80,000].
  • Employees. - Number of employees by year.
  • GINI. - Net Wage Gini Index.
  • TAXES. - Withholding tax.
  • GDP. - Gross Domestic Product.
  • UnempRate. - Unemployment Rate.
  • NMW. - National Minimum Wage.
  • Mean Salary. - Mean Gross wage in Range.
  • DEBT. - National Debt.

The size of the nodes depends on their degree. The higher the degree, the more the variable is related to the rest. The correlation between two nodes $(i,j)$ is represented by $corr(i,j)$. An $edge(i,j)$ is defined if $abs\left(corr(i,j)\right)\geq0.5$. Each edge is labeled with the value of its Spearman correlation. The node color indicates which of the two detected communities the variable belongs to; variables of the same color have a stronger relationship with each other. TAXES and GDP are connected to variables in both communities, whereas GINI is related only to variables in its own community. The results are shown in Figure 4:

  • Modularity: 0.156.
  • Number of Communities: 2.
  • Average Degree: 4.889.
  • Maximum Nodes Degree: 6 (GDP, TAXES).
  • Minimum Nodes Degree: 3 (GINI).
  • Maximum linear correlation >0.9: (NMW - DEBT), (NMW - Mean Salary), (DEBT - Mean Salary).
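
The edge rule described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the author's pipeline: the helper names (`ranks`, `spearman`, `correlation_graph`, `degrees`) and the toy series are hypothetical, and the community detection step (the paper cites the Louvain method [17]) is deliberately left out.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    return pearson(ranks(x), ranks(y))


def correlation_graph(series, threshold=0.5):
    """Return {(i, j): rho} for every pair with abs(rho) >= threshold."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        rho = spearman(series[a], series[b])
        if abs(rho) >= threshold:
            edges[(a, b)] = rho
    return edges


def degrees(series, edges):
    """Count the edges incident to each node."""
    deg = {name: 0 for name in series}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical toy series (NOT the paper's data), one value per year.
toy = {
    "NMW": [500, 520, 560, 600, 650, 740],
    "MeanSalary": [21.0, 21.4, 21.9, 22.5, 23.0, 24.1],
    "Diff": [5.0, 4.6, 4.1, 3.9, 3.2, 2.8],
    "Noise": [3, 1, 2, 3, 1, 2],
}
edges = correlation_graph(toy)
deg = degrees(toy, edges)
```

In this toy example the three monotonic series end up fully connected, while `Noise` stays isolated because its correlation with each of them falls below the 0.5 threshold.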


Figure 4 Network visualization. Colors represent the communities detected in the graph. The size of each node is proportional to its degree. Each edge represents a Spearman correlation, and the value of the correlation is displayed.

Diff is the only node with multiple negative correlations (Employees, GDP, and TAXES). It is strongly negatively correlated with the number of employees ($\rho =-0.89$). The NMW positively correlates with Mean Salary ($\rho =0.96$).

3.5 Regression Analysis

After the visual analysis, the results are examined analytically. Three types of regressions are analyzed: a multiple linear regression, a regression model based on machine learning algorithms, and a time series regression model.

3.5.1 Multiple Linear Regression

The coefficients and their p-values show the nature of the relationships in the model and whether those relationships are statistically significant. Different combinations of variables were examined to develop a model that maximizes the R-squared value and excludes the variables that lack statistical significance. The GINI index of net salaries was used as the dependent variable. The objective is to analyze how the different macroeconomic variables studied affected income inequality in Spain between 2001 and 2021. The results are shown in Table 2.
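
For reference, a generic form of such a model can be written as follows. The symbols are assumptions introduced here for illustration: $x_{k,t}$ stands for the $k$-th retained explanatory variable in year $t$, and the set of $K$ retained variables is determined by the selection procedure described above, not assumed in this sketch.

$$GINI_t = \beta_0 + \sum_{k=1}^{K} \beta_k\, x_{k,t} + \varepsilon_t, \qquad R^2 = 1 - \frac{\sum_t \left(GINI_t - \widehat{GINI}_t\right)^2}{\sum_t \left(GINI_t - \overline{GINI}\right)^2}$$

Here $\widehat{GINI}_t$ is the fitted value and $\overline{GINI}$ is the sample mean. A coefficient $\beta_k$ is usually treated as statistically significant when its p-value falls below a chosen threshold, commonly 0.05.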

Table 2 Ordinary Least Squares regression results.

3.5.2 Random Forest Regressor

The preparation of the model run included the following steps:

  1. Define the dependent variable, here the Gini index.
  2. Parameterize the model by minimizing the MSE.
  3. Implement the optimal model.

The optimized model had a mean squared error of 0.0039. Feature importance refers to a class of techniques for assigning scores to input features in a prediction model, indicating the relative importance of each feature in making a prediction. Relative scores can be used for subsequent analysis, highlighting which features are most relevant to the target. This information characterizes the model used to make predictions. Low scores suggest eliminating the variable to reduce the dimensionality of the problem. Features with high scores are analyzed in more detail as the main drivers of the model's predictions. The feature importances are shown in Table 3.

Table 3 Random Forest Regressor Model feature importance.

There is a second method of assessing importance in the regressor model, called permutation importance. This involves randomly shuffling each feature and measuring how the model's performance changes. The features whose shuffling has the most significant impact on performance are the most important ones. Permutation importance does not reflect the intrinsic predictive value of a feature per se but rather the importance of that feature for a particular model. The result is shown in Figure 5.
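
The shuffling procedure can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: `toy_model` is a hypothetical stand-in for the fitted regressor, the data are synthetic, and the importance is measured here as the increase in MSE. scikit-learn [22] offers `sklearn.inspection.permutation_importance` with an `n_repeats` parameter; whether the author's run matches this sketch in detail is not confirmed by the text.

```python
import random


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean and std of the MSE increase when each column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    results = {}
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle one column while leaving all other columns intact.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            increases.append(mse(y, [predict(r) for r in X_perm]) - baseline)
        mean = sum(increases) / n_repeats
        std = (sum((v - mean) ** 2 for v in increases) / n_repeats) ** 0.5
        results[col] = (mean, std)
    return results


# Hypothetical stand-in model: uses feature 0 only and ignores feature 1.
def toy_model(row):
    return 2.0 * row[0]


X = [[float(i), float((i * 7) % 5)] for i in range(12)]
y = [2.0 * row[0] for row in X]
importance = permutation_importance(toy_model, X, y)
```

Because the toy model ignores feature 1, shuffling that column leaves the error unchanged and its importance is zero, whereas shuffling feature 0 degrades the predictions. This mirrors the point made above: the score describes the feature's role in a particular model, not its intrinsic predictive value.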


Figure 5 Permutation feature importance. The feature importance of the estimators for the dataset is calculated by the permutation importance function. Each feature is randomly shuffled a number of times, and a sample of feature importances is returned. After 10 repetitions, the results stayed the same. The red dot represents the mean of the replicates. The line represents the standard deviation of each value.

3.6 Mean Salary Difference Explained

The mean salary was calculated by considering all employees in all bands of the analyzed database. The bands below €10,000 and above €80,000 were excluded from the calculation of the Mean Salary Range. The first band corresponds to employees who did not work the full calendar year, which distorts the average. Similarly, the band above €80,000 is open-ended and includes remarkably high salaries, which also distorts the average. For this reason, the variable ’average salary range’ corresponds to the average for employees whose gross annual salary lies in the range [10K-80K]. There has been a shift in the proportions of employees by band, particularly since 2016, coinciding with more substantial increases in the minimum wage. Over the years, the [20K-40K] band has increased from 5.85% of the population (12.51% of gross income) in 2001 to 15.61% (20.96% of gross income) in 2021. This is because underemployment (workers who want to work more hours and whose contracts do not cover 40 hours a week or all months of the year) exceeds 15% of total employment (Brecha de género en el empleo por tipo de empleo y periodo, n.d.). This analysis is shown in Figure 6.
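
The two averages and their difference can be illustrated with a small banded table. The sketch below rests on assumptions: the band boundaries, employee counts, and income totals are invented, the helper names are hypothetical, and the sign convention for the difference (overall mean minus in-range mean) is not stated in the text.

```python
# Hypothetical band table (NOT the paper's data):
# (lower bound, upper bound, employees, total gross income), amounts in euros.
bands = [
    (0, 10_000, 400, 2_000_000),
    (10_000, 20_000, 300, 4_500_000),
    (20_000, 40_000, 200, 6_000_000),
    (40_000, 80_000, 80, 4_400_000),
    (80_000, None, 20, 3_100_000),  # open-ended top band
]


def mean_salary(rows):
    """Total gross income divided by total employees over the given bands."""
    employees = sum(r[2] for r in rows)
    income = sum(r[3] for r in rows)
    return income / employees


def in_range(rows, low=10_000, high=80_000):
    """Keep only the bands lying entirely inside [low, high]."""
    return [r for r in rows
            if r[0] >= low and r[1] is not None and r[1] <= high]


mean_all = mean_salary(bands)              # all employees
mean_range = mean_salary(in_range(bands))  # [10K-80K] only
diff = mean_all - mean_range               # assumed sign convention
```

In this toy table the large lowest band pulls the overall mean well below the in-range mean, so the difference is negative. As workers move out of the lowest band, the two means converge and the gap narrows.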


Figure 6 Ratio of income to employees per band [2001-2016-2022]. The share of employees per band for the three years analyzed [2001-2016-2022] is shown in the upper part of the graph. The share of the sum of their gross annual income per income group is shown in the lower part of the graph.

The Spearman correlation between the DIFF variable and GINI was $\rho =0.75$, as shown in Figure 4. DIFF is the most influential variable in predicting the Gini index.

The evolution of the percentile of the minimum wage over the period under study is analyzed and presented in Figure 7, together with other macroeconomic variables. Notably, the percentile rises above 40. This means that 40% of the wages recorded in the database are below the minimum wage. The evolution of the percentile is related to an increasing minimum wage and to the evolution of the average gross salary.
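
A percentile of this kind can be read as the share of recorded wages that lie below the minimum wage. The following minimal sketch assumes individual annual wages are available, whereas the source data are banded; the wage values, the annualised minimum wage, and the function name `percentile_rank` are hypothetical.

```python
from bisect import bisect_left


def percentile_rank(wages, threshold):
    """Percentage of wages strictly below the threshold."""
    ordered = sorted(wages)
    return 100.0 * bisect_left(ordered, threshold) / len(ordered)


# Hypothetical annual gross wages in euros (NOT the paper's data).
wages = [6_000, 9_000, 11_000, 12_500, 15_000,
         18_000, 22_000, 27_000, 35_000, 60_000]
annual_minimum_wage = 13_300  # hypothetical annualised minimum wage
rank = percentile_rank(wages, annual_minimum_wage)
```

With these toy values, four of the ten wages fall below the threshold, so the minimum wage sits at the 40th percentile of the wage vector.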


Figure 7 Ratio of income to employees per band [2001-2016-2022]. Each year, the difference between the average gross salary of all employees and the gross salary range [10,000-80,000] is evaluated and represented by the brown line. The National Minimum Wage is represented by bars in the figure. The cream line represents the Gini index corresponding to the net income values of the employees each year. The blue line represents the evolution of the NMW percentile versus the annual gross salary vector.

4. Discussion

This paper is based on the statistical analysis of macroeconomic time series published by Spanish government agencies. The Bank of Spain, among others, predicted in 2017 that increasing the minimum wage would cause unemployment to rise [12]. Experience has shown that these forecasts were not accurate. Figure 3 shows that raising the minimum wage is compatible with creating new jobs. The average wage increases and the income distribution becomes fairer. These results are consistent with some of the studies mentioned in the paper. In the United Kingdom, the development of wage inequality was examined for the period 1999-2010, and the impact on inequality of introducing the minimum wage in 1999 was moderate. This can be explained by the fact that the minimum wage was introduced at a level below the 10th percentile of the earnings distribution [25]. However, the specific features of the structure of the Spanish economy must be considered. One reason for the impact on income inequality in Spain may be that the most recent minimum wage is at the 30-40th percentile.

Extrapolating these results to other economies, or even to Spain in the near future, may be unwise. An increase in economic inequality can lead to a lack of trust in governing politicians [26]. In 2022, Piketty et al. [14] suggested that policy discussions on inequality should focus on policies that affect pre-tax inequality rather than exclusively on tax redistribution. This paper shows that redistribution does not change the dynamics of economic inequality. As the minimum wage increases, there is an increase in the number of workers in the €20,000-€40,000 range. This tends to reduce income inequality and raise the average salary. This holds for the Spanish economy because its minimum wage is still far from the level of the leading European economies, such as France or Germany. It remains an open question whether further increases in the minimum wage will hurt the economy. It could be argued that as the Spanish minimum wage approaches that of Germany, either productivity must increase or economic imbalances, which have been widely analyzed, could arise. Inequality tends to be pro-cyclical. Low-income households and young people tend to be hit harder by recessions. The distribution of labor and capital income differs across countries. Therefore, the cyclicality of income inequality may also differ [27].

Three different regression models were used. They all confirm that the DIFF variable is the best predictor of income inequality. Its effect is particularly noticeable when there is a shift towards annual salaries above €20,000, mainly due to the increase in the minimum wage. There is reason to believe that, as the minimum wage increases, the wage structure will follow the same trend and catch up with the major economies, provided the minimum wage percentile is above 20. In addition, income inequality will improve, as measured by a decline in the Gini index. The collection of corporate income tax rose to a record figure of more than 200 billion euros for the first time in 2019 (Recaudación y Estadísticas del Sistema Tributario Español: Ministerio de Hacienda y Función Pública, n.d.). Therefore, there is no evidence of a reduction in corporate profits because of a minimum wage increase.

4.1 Similarities with Study Using Institutional Datasets

Despite differences in national contexts and methodological approaches, the Estonian, French, and Spanish studies show notable convergences in their data structure and conclusions. These common findings strengthen the overall evidence base on the impact of minimum wages on income inequality and redistribution. The Estonian analysis [13] finds that raising the minimum wage has reduced wage inequality by raising the wage floor and generating positive spillover effects. Similarly, the analysis in this paper shows that income inequality has been reduced by raising the minimum wage. This convergence challenges classical economic models that predict that market interventions necessarily create economic inefficiencies. A second area of convergence concerns the absence of significant adverse employment effects. Contrary to traditional economic predictions, neither the Estonian nor the Spanish study identified substantive employment reductions following minimum wage increases. The third area of convergence involves recognizing that predistributive measures (affecting pre-tax income distribution) can be more effective than purely redistributive approaches.

Despite these convergences, this work presents several distinctive elements in methodology and conclusions. It introduces machine learning techniques and network analysis not employed in the comparative studies. This computational approach to identifying complex non-linear relationships represents a methodological innovation beyond the traditional econometric techniques used in the Estonian and French studies. Another distinctive feature of this analysis is its detailed examination of the shifting composition of the workforce across wage bands. The [20K-40K] bracket is found to have grown from 5.85% of the population (12.51% of gross income) in 2001 to 15.61% (20.96% of gross income) in 2021, providing insights into redistributive mechanisms not explored in comparable depth in the Estonian or French studies.

For policymakers, these studies collectively suggest that well-designed minimum wage policies can serve as an effective tool for reducing income inequality without necessarily having negative consequences for employment or economic growth. However, the contextual differences highlighted also underline the importance of calibrating such policies to specific national economic conditions rather than applying universal approaches across different settings.

4.2 Contextual Considerations and Future Directions

While the Spanish experience during 2001-2021 demonstrates positive outcomes from minimum wage increases, several contextual factors should be considered when interpreting these results. Spain’s economic structure, with significant employment in tourism and services, may exhibit different adjustment mechanisms than economies with larger manufacturing sectors or different labor market institutions. Crucially, Spain’s minimum wage remained substantially below levels observed in more developed European economies such as France or Germany for much of the study period. This differential likely created headroom for significant increases without triggering the adverse effects predicted by conventional economic models.

An important consideration for future policy is that as Spain’s minimum wage converges with those of countries like France and Germany, the economic effects may differ substantially from those observed in this study. The positive relationships identified between minimum wage increases, employment growth, and business profitability might not maintain their linear relationship as wage floors approach levels comparable to more advanced European economies. The unique structural characteristics of the Spanish economy—including its sectoral composition, productivity levels, and competitive position within European markets—may create different adjustment dynamics as convergence progresses. Therefore, while successive minimum wage increases might further improve redistributive outcomes in the near term, careful monitoring and analysis will be required to identify potential threshold effects or structural adjustments as Spain’s labor market institutions increasingly resemble those of its northern European counterparts.

5. Conclusion

This study examines the impact of National Minimum Wage (NMW) adjustments on income inequality and wealth redistribution in Spain over a significant twenty-year period (2001-2021), using comprehensive administrative tax records constituting a complete census of Spanish wage earners. The results offer several substantive contributions to the literature on minimum wage effects and challenge conventional economic predictions about their potential consequences.

5.1 Primary Empirical Contributions

The empirical evidence presented in this study shows that substantial increases in Spain’s minimum wage - from €505.7/month in 2001 to €1,108.3/month in 2021 - coincided with statistically significant reductions in income inequality. The machine learning models used, particularly the Random Forest Regressor, identified a robust relationship between minimum wage increases and lower Gini coefficients, with the variable measuring shifts in income distribution across wage brackets emerging as the strongest predictor of inequality reduction.

Contrary to the predictions of classical labor market models, no evidence was found to link minimum wage increases with employment reductions or inflationary pressures over the period analyzed. Instead, the data reveal a pattern of increased net employment, particularly in the €20,000-€40,000 annual income bracket, which expanded from 5.85% of the labor force in 2001 to 15.61% in 2021. This structural shift in the Spanish wage distribution represents a significant improvement in labor market outcomes for low and middle-income workers.

Perhaps most notably, measures of business profitability remained robust throughout minimum wage increases, with corporate tax collections reaching record levels through 2019. This finding contradicts predictions that higher wage floors will necessarily squeeze business margins or reduce economic dynamism. Instead, the evidence suggests a mutually reinforcing relationship between wage growth and business performance under the conditions analyzed.

5.2 Theoretical and Policy Implications

The study’s results hold particular significance for policy debates regarding redistribution mechanisms. The evidence indicates that predistributive policies affecting market income directly (such as minimum wage regulation) can be highly effective tools for reducing inequality. Notably, these effects were achieved without requiring substantial increases in fiscal redistribution through the tax and transfer system, suggesting an economically and politically sustainable approach to addressing income disparities.

The effectiveness of Spain’s minimum wage increases appears linked to their positioning within the overall wage distribution. When the minimum wage reached approximately the 30-40th percentile of the earnings distribution (as observed in the later years of the study period), its redistributive impact was maximized. This finding offers potential guidance for calibrating minimum wage policies in other contexts, suggesting an optimal range for wage floor positioning relative to the median wage.

In conclusion, this analysis of Spain’s experience from 2001 to 2021 provides compelling evidence that minimum wage policies, when appropriately designed and implemented, can function as practical tools for reducing income inequality while supporting broader economic prosperity. The findings challenge pessimistic predictions regarding employment and inflation effects while highlighting the potential of labor market institutions to promote more equitable income distribution without compromising economic performance.

The comprehensiveness of the administrative data and the sophisticated analytical techniques applied in this study offer methodological advantages over much previous research, providing a robust empirical foundation for these conclusions. While acknowledging the contextual specificities of the Spanish case, the findings contribute valuable insights to ongoing academic and policy debates regarding optimal approaches to addressing rising inequality in contemporary market economies.

6. Limitations of the Study

Several methodological and conceptual limitations should be considered when interpreting the results of this study. These constraints do not invalidate the conclusions drawn but rather delineate the boundaries within which these findings can be generalized and applied.

6.1 Restricted Definition of Income and Wealth

The first limitation concerns the restricted definition of income used in this analysis. The study examines only labor income reported to the Spanish Tax Agency (AEAT), excluding other significant components of household wealth and income. In their comprehensive analysis of the distribution of income and wealth in Spain, Saez and Alvaredo show that non-labor income - particularly from capital and property - accounts for a substantial share of total income, especially for higher-income households. Excluding these sources of income may give an incomplete picture of the overall patterns of wealth redistribution [28].

Moreover, Spain has experienced significant changes in the composition of wealth over the period studied, with financial assets becoming increasingly important relative to real estate, especially among the higher wealth segments. By focusing exclusively on labor income, our analysis cannot capture these evolving patterns of wealth diversification and their impact on overall inequality [29]. A distributional national accounts methodology suggests that including all sources of income typically reveals higher levels of inequality than those observed through labor income alone.

6.2 Absence of Socio-Political Context Analysis

A second major limitation is the lack of a comprehensive analysis of the socio-political contexts in which minimum wage policies have been implemented. It is argued that the redistributive effects of labor market institutions cannot be fully understood in isolation from broader political dynamics and welfare state structures. Our study does not consider concurrent policy changes in areas such as taxation, social transfers, housing policy, or education, which may complement or counteract the effects of minimum wage increases.

It has been shown that the effectiveness of redistributive labor market policies depends significantly on their embeddedness within a broader institutional framework. The same formal policy may have different distributive outcomes depending on the institutional context in which it operates. Our analysis cannot determine the extent to which the observed effects of minimum wage increases depended on complementary institutional arrangements in Spain during the period studied [30].

6.3 Macroeconomic Interaction Effects

The study does not fully explore the complex interactions between GDP growth, inflation (as measured by the Consumer Price Index), and minimum wage effects. While our graphical analysis identifies correlations between these variables, more sophisticated macroeconomic modeling would be required to isolate causal relationships and address potential endogeneity issues. In addition, we do not account for sectoral differences in technology adoption and automation potential, which may have led to heterogeneous effects across industries and occupational categories.

The relationship between minimum wage increases and productivity, a crucial factor in determining sustainability, receives insufficient attention in our analysis. The study does not explicitly measure productivity adaptations, which may be critical in explaining why employment effects remained positive despite significant wage increases.

6.4 Temporal Limitations and External Validity

The study covers 2001-2021, including expansionary and recessionary periods in the Spanish economy. However, the most significant minimum wage increases occurred relatively late, especially from 2019 onwards. This temporal concentration may limit our ability to fully assess long-term adjustment processes and delayed effects that may emerge over more extended periods, as suggested by Cengiz et al. [4] in their analysis of lagged minimum wage effects. Moreover, the COVID-19 pandemic created exceptional economic circumstances in the last years of our study period. Thus, our findings on recent minimum wage increases should be interpreted cautiously regarding their generalizability to non-crisis contexts. The external validity of our results for other national contexts remains uncertain.

6.5 Future Research Directions

These limitations suggest several promising directions for future research. Studies incorporating comprehensive measures of income (including capital, property, and transfer income) would provide a more complete assessment of how minimum wage policies affect overall economic inequality. Examining regional and demographic heterogeneity would improve understanding of the distributional effects across different population segments and geographic areas. More sophisticated macroeconomic modeling, possibly using structural approaches to better account for endogeneity and general equilibrium effects, would strengthen the causal identification of minimum wage effects. Finally, extending the analysis to years beyond 2021 would allow for an assessment of whether the observed patterns persist as the economy recovers from pandemic-related disruptions and adjusts to persistently higher minimum wage levels. Despite these limitations, the study makes an essential contribution by using comprehensive administrative data to examine the effects of minimum wages on labor income inequality and by demonstrating patterns that challenge conventional economic predictions about employment and firm performance outcomes.

Author Contributions

The author did all the research work for this study.

Competing Interests

The author has declared that no competing interests exist.

References

  1. Lee DS. Wage inequality in the United States during the 1980s: Rising dispersion or falling minimum wage? Q J Econ. 1999; 114: 977-1023. [CrossRef]
  2. DiNardo J, Fortin N, Lemieux T. Labor market institutions and the distribution of wages, 1973-1992: A semiparametric approach. Econometrica. 1996; 64: 1001-1044. [CrossRef]
  3. Autor DH, Manning A, Smith CL. The contribution of the minimum wage to US wage inequality over three decades: A reassessment. Am Econ J Appl Econ. 2016; 8: 58-99. [CrossRef]
  4. Cengiz D, Dube A, Lindner A, Zipperer B. The effect of minimum wages on low-wage jobs. Q J Econ. 2019; 134: 1405-1454. [CrossRef]
  5. Garnero A, Kampelmann S, Rycx F. Minimum wage systems and earnings inequalities: Does institutional diversity matter? Eur J Ind Relat. 2015; 21: 115-130. [CrossRef]
  6. Clemens J. How do firms respond to minimum wage increases? Understanding the relevance of non-employment margins. J Econ Perspect. 2021; 35: 51-72. [CrossRef]
  7. Lopresti JW, Mumford KJ. Who benefits from a minimum wage increase? ILR Rev. 2016; 69: 1171-1190. [CrossRef]
  8. Neumark D, Schweitzer M, Wascher W. Minimum wage effects throughout the wage distribution. J Hum Resour. 2004; 39: 425-450. [CrossRef]
  9. Sellekaerts B. Effect of the minimum wage on inflation and other key macroeconomic variables. East Econ J. 1982; 8: 177-190.
  10. Dustmann C, Lindner A, Schönberg U, Umkehrer M, Vom Berge P. Reallocation effects of the minimum wage. Q J Econ. 2022; 137: 267-328. [CrossRef]
  11. Kemal BM, Kocaman M. The impact of minimum wage on unemployment, prices, and growth: A multivariate analysis for Turkey. Econ Ann. 2019; 64: 65-83. [CrossRef]
  12. Lacuesta A, Izquierdo M, Puente S. Un análisis del impacto de la subida del salario mínimo interprofesional en 2017 sobre la probabilidad de perder empleo. Madrid, Spain: Banco de España; 2019. [CrossRef]
  13. Ferraro S, Meriküll J, Staehr K. Minimum wages and the wage distribution in Estonia. Appl Econ. 2018; 50: 5253-5268. [CrossRef]
  14. Bozio A, Garbinti B, Goupille-Lebret J, Guillot M, Piketty T. Predistribution versus redistribution: Evidence from France and the United States. Am Econ J Appl Econ. 2024; 16: 31-65. [CrossRef]
  15. Newman M. Networks. New York, NY: Oxford University Press; 2018.
  16. Suchowski MA. An analysis of the impact of an outlier on correlation coefficients across small sample data where RHO is non-zero. Kalamazoo, MI: Western Michigan University; 2001.
  17. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech. 2008; 2008: P10008. [CrossRef]
  18. Fortunato S, Hric D. Community detection in networks: A user guide. Phys Rep. 2016; 659: 1-44. [CrossRef]
  19. Jacomy M, Venturini T, Heymann S, Bastian M. ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS One. 2014; 9: e98679. [CrossRef]
  20. Seabold S, Perktold J. Statsmodels: Econometric and statistical modeling with python. SciPy. 2010; 7: 92-96. [CrossRef]
  21. Breiman L. Random forests. Mach Learn. 2001; 45: 5-32. [CrossRef]
  22. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. J Mach Learn Res. 2011; 12: 2825-2830.
  23. Cabrer-Borras B, Rico P. Impacto económico del sector turístico en España. Stud Appl Econ. 2021; 39. doi: 10.25115/eea.v39i2.3599. [CrossRef]
  24. Molinero-Gerbeau Y, López-Sala A, Șerban M. On the social sustainability of industrial agriculture dependent on migrant workers. Romanian workers in Spain’s seasonal agriculture. Sustainability. 2021; 13: 1062. [CrossRef]
  25. Stewart MB. Estimating the impact of the minimum wage using geographical wage variation. Oxf Bull Econ Stat. 2002; 64: 583-605. [CrossRef]
  26. Andersen R. Support for democracy in cross-national perspective: The detrimental effect of economic inequality. Res Soc Stratif Mobil. 2012; 30: 389-402. [CrossRef]
  27. Clemens M, Eydam U, Heinemann M. Inequality over the business cycle: The role of distributive shocks. Macroecon Dyn. 2023; 27: 571-600. [CrossRef]
  28. Saez E, Alvaredo F. Income and wealth concentration in Spain in a historical and fiscal perspective. Paris & London: CEPR Press; 2006; CEPR Discussion Paper No. 5836.
  29. Martínez-Toledano C. House price cycles, wealth inequality and portfolio reshuffling. Paris, France: WID; 2020; n° 2017/19.
  30. Pontusson J, Weisstanner D. Macroeconomic conditions, inequality shocks and the politics of redistribution, 1990–2013. J Eur Public Policy. 2018; 25: 31-58. [CrossRef]