Revenue by Customer City
MediumPremium
Given the schema shown below, write a solution to fetch the total transaction revenue per user city, ordered by descending revenue (in USD).
users products +---------------+---------+ +-----------------+---------+ +--| id | int | +-----| id | int | | | first_name | varchar | | | name | varchar | | | last_name | varchar | | +->| product_line_id | date | | | user_city | int | | | | stock | int | | | email | int | | | +-----------------+---------+ | +---------------+---------+ | | | | | | transactions | | product_lines | +---------------+---------+ | | +--------+--------+ | | id | int |<----+ +--| id | int | +---->| customer_id | int | | name | varchar| | product_id | int | +--------+--------+ | amount | int | | currency_code | varchar | | date | date | +---------------+---------+ exchange_rate +----------------------+---------+ | id | int | | source_currency_code | varchar | | target_currency_code | varchar | | rate | numeric | +----------------------+---------+
Your answer should return a result with the following format:
user_city | total_revenue -----------+------------------------ varchar | float (no need to round)
To fetch the total transaction revenue per user city, ordered by descending revenue (in USD), we can do the following:
- Merge Transactions with Users: Combine
transactionswithusersto link each transaction to a user’s city. - Convert Transaction Amounts to USD: Use
exchange_rateto convert all transaction amounts to USD. Ensure that you handle transactions already in USD correctly. - Aggregate Revenue by City: Group by user city and sum up the revenues.
- Sort by Revenue: Order the results by descending revenue.
Pythondef find_revenue_by_city(transactions: pd.DataFrame,
users: pd.DataFrame,
exchange_rate: pd.DataFrame) -> pd.DataFrame:
# Merge Transactions with Users
transactions_with_users = pd.merge(transactions, users, left_on='customer_id', right_on='id')
# Merge non-USD transactions with exchange rates
transactions_with_exchange = pd.merge(
transactions_with_users,
exchange_rate[exchange_rate['target_currency_code'] == 'USD'],
how='left',
left_on='currency_code',
right_on='source_currency_code'
)
# Assign an exchange rate of 1.0 for USD transactions
transactions_with_exchange['rate'] = transactions_with_exchange['rate'].fillna(1.0)
# Calculate the amount in USD
transactions_with_exchange['amount_usd'] = transactions_with_exchange['amount'] * transactions_with_exchange['rate']
# Aggregate Revenue by City
revenue_by_city = transactions_with_exchange.groupby('user_city')['amount_usd'].sum().reset_index()
# Sort by Revenue
revenue_by_city = revenue_by_city.sort_values(by='amount_usd', ascending=False).reset_index(drop=True)
# Rename columns for clarity
revenue_by_city.columns = ['user_city', 'total_revenue']
return revenue_by_city
Hi, my solution gives the exact numerical values as the proposed solution, but it doesn't pass the tests. Am I missing something, or is this a bug?
def find_revenue_by_city(transactions: pd.DataFrame, users: pd.DataFrame, exchange_rate: pd.DataFrame) -> pd.DataFrame: # gets user city for each user id user_ids = users[['id', 'user_city']] # and merge on transactions transactions = transactions.merge(user_ids, how='left', left_on = 'customer_id', right_on = 'id') # gets exchange rates with target == USD cur_usd = exchange_rate[exchange_rate['target_currency_code'] == 'USD']\ [['source_currency_code', 'rate']] transactions = transactions.merge(cur_usd, 'outer', left_on='currency_code', right_on='source_currency_code').\ fillna(1)[['user_city', 'amount', 'rate']] transactions['total_revenue'] = transactions['amount']*transactions['rate'] transactions = transactions.groupby(by='user_city')[['total_revenue']].sum().reset_index().sort_values('total_revenue', ascending=False) return transactions