TabularData

Import, organize, and prepare a table of data to train a machine learning model.

Posts under TabularData tag

6 Posts

Post

Replies

Boosts

Views

Activity

TabularData Resources
TabularData framework lets you import, organize, and export a table of data. It’s great when you’re training a machine learning model but it’s a handy tool in many other scenarios as well.

General:
DevForums tag: TabularData
TabularData framework documentation
Explore and manipulate data in Swift with TabularData tech talk
For a ‘hello world’ style example, see this DevForums post

Share and Enjoy — Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
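To make the "import, organize, and export" description concrete, here is a minimal sketch of that round trip. The CSV text, the column names, and the age threshold are all made up for illustration; this is not the example from the linked DevForums post.

```swift
import Foundation
import TabularData

// Made-up sample data standing in for a CSV file on disk.
let csv = """
name,age
Ada,36
Grace,45
Linus,28
"""

// Import: parse the CSV text into a data frame.
let frame = try DataFrame(csvData: Data(csv.utf8))

// Organize: keep only the rows whose "age" value is above 30.
// filter(on:_:_:) hands the closure an optional because cells can be nil.
let over30 = frame.filter(on: "age", Int.self) { ($0 ?? 0) > 30 }
print(over30)

// Export: turn the full frame back into CSV data.
let exported = try frame.csvRepresentation()
print(String(decoding: exported, as: UTF8.self))
```

The filter returns a slice (a view onto the original frame), which is fine for printing and counting rows here.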
0
0
1.2k
Mar ’23
FB13516799: Training Tabular Regression ML Models on large datasets in Xcode 15 continuously "Processing"
Hi, In Xcode 14 I was able to train linear regression models with Create ML using large CSV files (I tested with about 30,000 items and 5 features). However, in Xcode 15 (I tested on 15.0.1 and 15.1), training stays in the "Processing" state indefinitely. With a dataset of 900 items, everything works fine. I filed a feedback for this issue: FB13516799. Does anybody else have this issue, or can anyone reproduce it?
2
0
786
Jan ’24
Is there a way to apply a formatting option to a DataFrame column outside of the explicit description(options:) method?
I'm building up a data frame for the sole purpose of using that lovely textual grid output. I'm getting output without any issue, but I'm trying to sort out how I might apply a formatter to a specific column so that print(dataframeInstance) "just works" nicely. In my use case, I'm running a function, collecting its output - appending that into a frame, and then using TabularData to get a nice output in a unit test, so I can see the patterns within the output. I found https://developer.apple.com/documentation/tabulardata/column/description(options:), but wasn't able to find any way to "pre-bind" that to a dataframe Column when I was creating it. (I have some double values that get a bit "excessive" in length due to the joys of floating point rounding) Is there a way of setting a formatter on a column at creation time, or after (using a property) that could basically use the same pattern as that description method above?
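I'm not aware of a per-column formatter property, so the following is only a sketch of two possible workarounds, not a confirmed answer. The first formats at print time through FormattingOptions; the floatingPointFormatStyle property is used as I recall it from the FormattingOptions documentation and is worth verifying against your SDK. The second rounds the stored values with transformColumn so that a plain print(frame) is already tidy, at the cost of changing the data. The column names and values are made up.

```swift
import Foundation
import TabularData

// Made-up sample data with "excessive" floating-point digits.
var frame = DataFrame()
frame.append(column: Column(name: "input", contents: [1, 2, 3]))
frame.append(column: Column(name: "output", contents: [0.1 + 0.2, 1.0 / 3.0, 2.0 / 3.0]))

// Workaround 1: apply a format style when producing the description.
// Assumption: FormattingOptions exposes `floatingPointFormatStyle`.
var options = FormattingOptions(
    maximumLineWidth: 80,
    maximumCellWidth: 20,
    maximumRowCount: 20,
    includesColumnTypes: true
)
options.floatingPointFormatStyle = FloatingPointFormatStyle<Double>()
    .precision(.fractionLength(3))
print(frame.description(options: options))

// Workaround 2: round the stored values to three decimal places,
// so that print(frame) needs no options. This modifies the data.
frame.transformColumn("output") { (value: Double) in
    (value * 1000).rounded() / 1000
}
print(frame)
```

In a unit test, wrapping the first workaround in a small helper function keeps the stored values untouched while still giving a consistent printout.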
1
0
594
Nov ’23
TabularData DataFrame removing row results in EXC_BAD_ACCESS
I am working with data in Swift using the TabularData framework. I load data from a CSV file into a DataFrame, then copy the data into a second DataFrame, and finally remove a row from the second DataFrame. The problem arises when I try to remove a row from the second DataFrame, at which point I receive an EXC_BAD_ACCESS error. However, if I modify the "timings" column (the final column) before removing the row (even to an identical value), the code runs without errors. Interestingly, this issue only occurs when a row in the column of the CSV file contains more than 15 characters.

This is the code I'm using:

func loadCSV() {
    let documentsDirectory = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!
    let url = documentsDirectory.appendingPathComponent("example.csv")
    var dataframe: DataFrame
    do {
        dataframe = try .init(
            contentsOfCSVFile: url,
            columns: ["user", "filename", "syllable count", "timings"],
            types: ["user": .string, "filename": .string, "syllable count": .integer, "timings": .string]
        )
    } catch {
        fatalError("Failed to load csv data")
    }
    print("First data frame", dataframe, separator: "\n")

    /// This works
    var secondFrame = DataFrame()
    secondFrame.append(column: Column<String>(name: "user", capacity: 1000))
    secondFrame.append(column: Column<String>(name: "filename", capacity: 1000))
    secondFrame.append(column: Column<Int>(name: "syllable count", capacity: 1000))
    secondFrame.append(column: Column<String>(name: "timings", capacity: 1000))
    for row in 0..<dataframe.rows.count {
        secondFrame.appendEmptyRow()
        for col in 0..<4 {
            secondFrame.rows[row][col] = dataframe.rows[row][col]
        }
    }
    // secondFrame.rows[row][3, String.self] = String("0123456789ABCDEF") /* If we include this line, it will not crash, even though the content is the same */

    print("Second data frame before removing row", dataframe, separator: "\n") // Before removal
    secondFrame.removeRow(at: 0)
    print("Second data frame after removing row", dataframe, separator: "\n") // After removal: we will get Thread 1: EXC_BAD_ACCESS here. The line will still print, however
}

and the csv (minimal example):

user,filename,syllable count,timings
john,john-001,12,0123456789ABCDEF
jane,jane-001,10,0123456789ABCDE

I've been able to replicate this bug on macOS and iOS using minimal projects. I'm unsure why this error is occurring and why modifying the "timings" column prevents it. It should be noted that this same error occurs with a single data frame loaded from a CSV file, which means that I basically cannot load from CSV if I want to modify the DataFrame afterwards.
2
1
748
Aug ’23
Issue inserting row in TabularData's DataFrame
I'm fairly new to Swift programming so I might be overlooking something, but I'm puzzled why the following code doesn't properly insert a row in a DataFrame. The goal is to move a row at a given index to a new index. I would normally:

- Copy the row that I want to move
- Remove the row from the original dataset
- Insert the copy to the new position

The CSV I'm using is from Wikipedia:

Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL! air, moon roof, loaded",4799.00

My code (Swift playground):

import Foundation
import TabularData

let fileUrl = Bundle.main.url(forResource: "data", withExtension: "csv")
let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
var dataFrame = try! DataFrame(contentsOfCSVFile: fileUrl!, options: options)
print("Original data")
print(dataFrame)

let rowToMove: Int = 2
let row = dataFrame.rows[rowToMove]
print("Row to move")
print(row)

dataFrame.removeRow(at: rowToMove)
print("After removing")
print(dataFrame)

dataFrame.insert(row: row, at: 0)
print("After inserting")
print(dataFrame)

This results in the following:

Original data
┏━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃   ┃ Year  ┃ Make     ┃ Model                                  ┃ Description                       ┃ Price    ┃
┃   ┃ <Int> ┃ <String> ┃ <String>                               ┃ <String>                          ┃ <Double> ┃
┡━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ 0 │ 1,997 │ Ford     │ E350                                   │ ac, abs, moon                     │  3,000.0 │
│ 1 │ 1,999 │ Chevy    │ Venture "Extended Edition"             │                                   │  4,900.0 │
│ 2 │ 1,999 │ Chevy    │ Venture "Extended Edition, Very Large" │                                   │  5,000.0 │
│ 3 │ 1,996 │ Jeep     │ Grand Cherokee                         │ MUST SELL! air, moon roof, loaded │  4,799.0 │
└───┴───────┴──────────┴────────────────────────────────────────┴───────────────────────────────────┴──────────┘
4 rows, 5 columns

Row to move
┏━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃   ┃ Year  ┃ Make     ┃ Model                                  ┃ Description ┃ Price    ┃
┃   ┃ <Int> ┃ <String> ┃ <String>                               ┃ <String>    ┃ <Double> ┃
┡━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ 2 │ 1,999 │ Chevy    │ Venture "Extended Edition, Very Large" │             │  5,000.0 │
└───┴───────┴──────────┴────────────────────────────────────────┴─────────────┴──────────┘
1 row, 5 columns

After removing
┏━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃   ┃ Year  ┃ Make     ┃ Model                      ┃ Description                       ┃ Price    ┃
┃   ┃ <Int> ┃ <String> ┃ <String>                   ┃ <String>                          ┃ <Double> ┃
┡━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ 0 │ 1,997 │ Ford     │ E350                       │ ac, abs, moon                     │  3,000.0 │
│ 1 │ 1,999 │ Chevy    │ Venture "Extended Edition" │                                   │  4,900.0 │
│ 2 │ 1,996 │ Jeep     │ Grand Cherokee             │ MUST SELL! air, moon roof, loaded │  4,799.0 │
└───┴───────┴──────────┴────────────────────────────┴───────────────────────────────────┴──────────┘
3 rows, 5 columns

After inserting
┏━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃   ┃ Year  ┃ Make     ┃ Model                                  ┃ Description                       ┃ Price    ┃
┃   ┃ <Int> ┃ <String> ┃ <String>                               ┃ <String>                          ┃ <Double> ┃
┡━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ 0 │ 1,999 │ Chevy    │                                        │                                   │  5,000.0 │
│ 1 │ 1,997 │ Ford     │ E350                                   │ ac, abs, moon                     │  3,000.0 │
│ 2 │ 1,996 │ Jeep     │ Grand Cherokee                         │ MUST SELL! air, moon roof, loaded │  4,799.0 │
│ 3 │   nil │ nil      │ nil                                    │ nil                               │      nil │
└───┴───────┴──────────┴────────────────────────────────────────┴───────────────────────────────────┴──────────┘
4 rows, 5 columns

Everything is fine up until inserting. I spot a few issues:

- A row gets deleted (original data row 1)
- A row filled with nils is added (at index 3)
- The row I want to insert isn't properly inserted (notice how the 'Model' text has gone)

I assume I'm missing something. Does it have to do with the row copy keeping its index (2)? How can I fix this?
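One way to sidestep holding a row across a mutation is to compute the new row order first and then rebuild each column in that order. This is only a sketch of a workaround under that assumption; it does not explain the behaviour of insert(row:at:) shown above. The rowOrder helper is hypothetical, and the two-column frame is a cut-down stand-in for the CSV data.

```swift
import TabularData

// Hypothetical helper: the row order that results from moving
// the row at `source` so that it ends up at `destination`.
func rowOrder(count: Int, moving source: Int, to destination: Int) -> [Int] {
    var order = Array(0..<count)
    let moved = order.remove(at: source)
    order.insert(moved, at: destination)
    return order
}

// A small frame standing in for the CSV data in the post.
var dataFrame = DataFrame()
dataFrame.append(column: Column(name: "Year", contents: [1997, 1999, 1999, 1996]))
dataFrame.append(column: Column(name: "Make", contents: ["Ford", "Chevy", "Chevy", "Jeep"]))

// Move row 2 to the top: the order becomes [2, 0, 1, 3].
let order = rowOrder(count: dataFrame.rows.count, moving: 2, to: 0)

// Read values through the typed columns (each element is optional),
// then build fresh columns in the new order.
let years = dataFrame["Year", Int.self]
let makes = dataFrame["Make", String.self]

var reordered = DataFrame()
reordered.append(column: Column<Int>(name: "Year", contents: order.map { years[$0] }))
reordered.append(column: Column<String>(name: "Make", contents: order.map { makes[$0] }))
print(reordered)
```

The drawback is that each column's type has to be known up front, so this suits small frames with a fixed schema rather than arbitrary CSV input.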
4
0
822
Jul ’23
error: cannot find 'MLDataTable' in scope
I have tried multiple playgrounds and consistently get the same error in every playground I create. There is a tabular data playground that does work, but I can't see what it does that I am not doing. Here is the code that fails with "error: cannot find 'MLDataTable' in scope":

/* code start */
import CoreML
import Foundation
import TabularData

let jsonFile = Bundle.main.url(forResource: "sentiment_analysis", withExtension: "json")!
let tempTable = try DataTable
let dataTable = try MLDataTable(contentsOf: jsonFile)
print(dataTable)
/* code end */
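One likely cause, offered as a sketch rather than a confirmed diagnosis: MLDataTable is declared in the Create ML framework, and the code above imports CoreML and TabularData but not CreateML. As far as I know, MLDataTable is available only on macOS, so the playground would also need to target macOS. The example below builds the table from an in-memory dictionary with made-up values so that it does not depend on the sentiment_analysis.json resource.

```swift
import CreateML   // MLDataTable is declared here, not in CoreML
import Foundation

// Made-up sample rows standing in for sentiment_analysis.json.
let dataTable = try MLDataTable(dictionary: [
    "text": ["Great product", "Terrible service"],
    "label": ["positive", "negative"],
])
print(dataTable)
```

With the import in place, the original MLDataTable(contentsOf: jsonFile) call should resolve the same way, provided the JSON resource is bundled with the playground.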
3
0
907
Jul ’23