
Write PySpark code to remove duplicate records based on a specific column, keeping only rows with unique values for that column.
Example:

Original
+----------+------------+----------+------+
|EmployeeID|EmployeeName|Department|Salary|
+----------+------------+----------+------+
|         1|    John Doe|   Finance| 55000|
|         2|  Jane Smith|        IT| 75000|
|         3|   Sam Brown|        HR| 55000|
|         4| Emily Davis|        IT| 80000|
+----------+------------+----------+------+