2017-07-21

Seleccionar un subconjunto de una base de datos sqlite con dplyr



Problema

Queremos extraer un subconjunto de filas de una base de datos sqlite con dplyr.

library(dplyr)
library(nycflights13)
flights_sqlite <- tbl(nycflights13_sqlite(), "flights")
flights_sqlite
Source:   query [?? x 19]
Database: sqlite 3.11.1 [C:\Users\Benjamin\AppData\Local\Temp\Rtmpsz2LSi/nycflights13.sqlite]

# A tibble: ?? x 19
    year month   day dep_time sched_dep_time dep_delay arr_time
                            
 1  2013     1     1      517            515         2      830
 2  2013     1     1      533            529         4      850
 3  2013     1     1      542            540         2      923
 4  2013     1     1      544            545        -1     1004
 5  2013     1     1      554            600        -6      812
 6  2013     1     1      554            558        -4      740
 7  2013     1     1      555            600        -5      913
 8  2013     1     1      557            600        -3      709
 9  2013     1     1      557            600        -3      838
10  2013     1     1      558            600        -2      753
# ... with more rows, and 12 more variables: sched_arr_time ,
#   arr_delay , carrier , flight , tailnum ,
#   origin , dest , air_time , distance , hour ,
#   minute , time_hour 

Al se un objeto de clase tbl_sqlite, la función slice genera un error.

slice(flights_sqlite, 10)
Error in UseMethod("slice_") : 
  no applicable method for 'slice_' applied to an object of class "c('tbl_sqlite', 'tbl_sql', 'tbl_lazy', 'tbl')

Solución

  • Primeras n filas
  • head(flights_sqlite, 10) %>% collect() 
    
    # A tibble: 10 x 19
        year month   day dep_time sched_dep_time dep_delay arr_time
                                
     1  2013     1     1      517            515         2      830
     2  2013     1     1      533            529         4      850
     3  2013     1     1      542            540         2      923
     4  2013     1     1      544            545        -1     1004
     5  2013     1     1      554            600        -6      812
     6  2013     1     1      554            558        -4      740
     7  2013     1     1      555            600        -5      913
     8  2013     1     1      557            600        -3      709
     9  2013     1     1      557            600        -3      838
    10  2013     1     1      558            600        -2      753
    # ... with 12 more variables: sched_arr_time , arr_delay ,
    #   carrier , flight , tailnum , origin , dest ,
    #   air_time , distance , hour , minute ,
    #   time_hour 
    

Entradas relacionadas

Referencias

No hay comentarios:

Publicar un comentario

Nube de datos